(57) ABSTRACT

Methods and apparatus are provided for ensuring that queries presented to a database are safe, thus providing solutions to both the undecidability and the lack of effective syntax issues. In one aspect of the invention, a method for use in a database system includes obtaining an original query entered by a user. The invention then provides for pre-processing the query before submittal to a database engine associated with the database system, wherein a result of the pre-processing operation is to ensure that a query provided to the engine is safe. Safety has a different meaning depending on the application. For example, safety may include ensuring that a query will return an output with finite results, or it may include ensuring that certain geometric properties in the query are preserved in the output.
FIG. 2

User enters Query 'Unsafe1' into interface, where Unsafe1 is:

"SELECT X1=R.X,
        X2=POINTS1.Value,
        X3=POINTS2.Value
FROM R, POINTS POINTS1, POINTS POINTS2
WHERE X1*(X2-1)=X3"

Pre-processor analyzes Unsafe1 and produces
query 'Safe2' by adding restriction

"SELECT X1=R.X,
        X2=POINTS1.Value,
        X3=POINTS2.Value
FROM R, POINTS POINTS1, POINTS POINTS2
WHERE X1*(X2-1)=X3
AND NOT EXISTS
    (SELECT Value1=P1.Value, Value2=P2.Value
     FROM POINTS P1, POINTS P2
     WHERE Value1 = 1 AND Value2=0)"







User queries database with Safe2 via User Interface 
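
The effect of the added NOT EXISTS clause can be illustrated with a minimal Python sketch. It assumes POINTS is a finite collection of integers and that R ranges over all real numbers; the function name `safe2` and the use of exact fractions are illustrative choices, not part of the figure. The unrestricted query has infinitely many answers only when X2 = 1 and X3 = 0 both occur in POINTS (then every real X1 satisfies the condition), so the guard returns an empty result in exactly that case and otherwise solves X1*(X2-1)=X3 for X1.

```python
from fractions import Fraction
from itertools import product


def safe2(points):
    """Sketch of Safe2 over a finite POINTS table (hypothetical helper)."""
    pts = set(points)
    # NOT EXISTS guard: if both 1 and 0 occur in POINTS, the pair
    # X2 = 1, X3 = 0 would let X1 be any real number, so return nothing.
    if 1 in pts and 0 in pts:
        return set()
    rows = set()
    for x2, x3 in product(pts, repeat=2):
        if x2 != 1:
            # X1 * (X2 - 1) = X3 has the single solution X3 / (X2 - 1).
            rows.add((Fraction(x3) / Fraction(x2 - 1), x2, x3))
        # If x2 == 1 here, then x3 != 0 and the equation has no solution.
    return rows
```

For instance, `safe2([2, 3])` yields four rows, one per pair (X2, X3), while `safe2([0, 1, 5])` yields the empty set.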
FIG. 3A

User enters Query 'Unsafe1' into user interface,
where Unsafe1 is:

"SELECT X1=R.X,
        Y=POINTS.Value
FROM R, POINTS
WHERE (X1>Y^2+2)"

310

User interface sends Unsafe1 to the query pre-processor,
which notices that the output "X1" is not
restricted by a relation. Hence pre-processor instructs
user-interface to prompt user for restrictions on X1

320

User inputs "X1^2-POINTS.Value=0"
(specifies that we should
only consider values x such
that x=√y for some y in POINTS.Value)

330

Pre-processor generates query 'Safe1' produced by
restricting output of Unsafe1 to values satisfying
the restriction.

340

Safe1 =

"SELECT X1=R.X,
        Y=POINTS.Value
FROM R, POINTS
WHERE X1>Y^2+2
AND X1^2-POINTS.Value=0"



TO FIG. 3B 
FIG. 3B

FROM FIG. 3A

User queries database repeatedly with Safe1, by sending
Safe1 back to the database engine via the user interface.
User Interface gives response back from database engine.
E.g.:

POINTS Table
Value
25
9    =>  Safe1 returns:  X1=5  Y=0
4                        X1=3  Y=1
1                        X1=3  Y=0
0
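
The way the user-supplied restriction makes the query finite can be sketched in Python. This is only an illustration under stated assumptions: the restriction is read existentially, as the figure's parenthetical describes (X1 must be a real square root of some POINTS value), the function name `safe1` is hypothetical, and the sample values in the usage note are chosen for illustration rather than taken from the figure.

```python
import math


def safe1(points):
    """Sketch of Safe1 over a finite POINTS table (hypothetical helper)."""
    # Range restriction X1^2 - v = 0 for some v in POINTS: the only
    # admissible X1 values are real square roots of POINTS values.
    candidates = set()
    for v in points:
        if v >= 0:
            root = math.sqrt(v)
            candidates.update({root, -root})
    # Original selection condition of Unsafe1: X1 > Y^2 + 2.
    return {(x1, y) for x1 in candidates for y in points if x1 > y * y + 2}
```

With the illustrative table `[49, 16, 1, 0]`, the candidate values for X1 are 7, -7, 4, -4, 1, -1 and 0, and only the pairs with X1 in {7, 4} and Y in {0, 1} satisfy the condition. Without the restriction, X1 would range over all reals and the answer set would be infinite.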






U.S. Patent Mar. 13, 2001 Sheet 5 of 6 



US 6,202,063 B1



FIG. 4A

User inputs query SafeRelational, which applies to
schema with input table VERT, which is a finite
relational table with 6 attributes:
v1xval, v1yval, v2xval, v2yval, v3xval, v3yval

SafeRelational outputs a table with 6 columns as well.
SafeRelational=

"SELECT U1=R1.X,
        U2=R2.X,
        V1=R3.X,
        V2=R4.X,
        W1=R5.X,
        W2=R6.X
FROM R R1, R R2, R R3, R R4, R R5, R R6, VERT
WHERE U1=VERT.v1xval+5 AND
      U2=VERT.v1yval*2 AND
      V1=VERT.v2xval+5 AND
      V2=VERT.v2yval*2 AND
      W1=VERT.v3xval+5 AND
      W2=VERT.v3yval*2"

410

Query pre-processor accepts SafeRelational as is,
since it requires no modification to be made safe. 



420 



User creates schema Figure of type Region (stores 
triangular regions in the plane) and inputs some 
sample data 




430 



TO FIG. 4B 
FIG. 4B



FROM FIG.4A 



User forms query SafeGeometric: 
"Apply SafeRelational to Vertices(Region)" 

Pre-processor generates a spatial query 
SafeGeometric2, which takes any region and 

returns the region obtained by taking all 
triangles making up the region, moving them 
5 places to the right and doubling their size 







User Applies SafeGeometric to Figure by sending 
SafeGeometric2 to database engine via user interface. 



440 



450 
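
The vertex-wise transformation expressed by SafeRelational can be sketched in Python. This is a minimal illustration, assuming each VERT row holds the six vertex coordinates of one triangle in the order v1xval, v1yval, v2xval, v2yval, v3xval, v3yval; the function name `safe_relational` is hypothetical. The sketch follows the WHERE clause as written, which adds 5 to each x coordinate and multiplies each y coordinate by 2.

```python
def safe_relational(vert):
    """Apply the SafeRelational mapping to every triangle row in VERT."""
    return [
        # x coordinates are shifted by 5, y coordinates are doubled.
        (v1x + 5, v1y * 2, v2x + 5, v2y * 2, v3x + 5, v3y * 2)
        for (v1x, v1y, v2x, v2y, v3x, v3y) in vert
    ]
```

Because each output row is again the vertex list of a triangle, applying the mapping to the vertices of a region made of triangles yields another region of the same kind, which is the geometric property the pre-processor relies on.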
METHODS AND APPARATUS FOR GENERATING AND USING SAFE CONSTRAINT QUERIES

CROSS REFERENCE TO RELATED APPLICATIONS

The present application is related to the U.S. patent application entitled: "Methods and Apparatus for Providing Aggregation Operations in Constraint Databases," filed concurrently herewith.

FIELD OF THE INVENTION

The invention relates to database management systems and, more particularly, to methods and apparatus for generating and using safe constraint queries in accordance with relational and spatial database management systems.

BACKGROUND OF THE INVENTION

The power of classical query languages is linked to the fact that they express a restricted class of declarative programs. The class of semantic objects expressible through queries in the relational calculus is limited in a number of helpful ways; for example, each such query is polynomial-time computable. Although relational calculus queries may not return finite results, a natural subclass of the relational calculus does, namely, the class of range-restricted queries. This class gives guarantees of finite output and is complete in that it captures all relational calculus queries whose outputs are always finite, i.e., safe queries. The class of range-restricted queries provides an effective syntax for safe queries, as every safe query is equivalent to a range-restricted one. However, the question of whether a relational calculus query is safe or not is known to be undecidable in conventional database systems.

The relational theory on which these results are based deals only with pure relational queries, that is, those containing no interpreted predicates. Practical query languages, in contrast, contain interpreted functions such as + and *. The resulting queries, then, make use of the domain semantics, rather than being independent of them as are pure relational queries. For example, if the underlying structure is the field of real numbers $\langle \mathbb{R}, +, *, 0, 1, < \rangle$, the extension of relational calculus is achieved by using polynomial (in)equalities. For example, the query $\varphi(x, y) \equiv \exists z.\, R(x, z) \wedge R(z, y) \wedge x^2 + y^2 = z$ defines a subset of the self-join operation with the condition that in joinable tuples $(x, z)$ and $(z, y)$, $z$ must be the sum of squares of $x$ and $y$. A natural question, then, is what sort of restrictions still apply to queries given with interpreted structure.

A problem related to safety is that of state-safety, that is, for a query and a database, determine if the output is finite. Unlike the safety problem, which is undecidable, the state-safety problem is decidable for some domains, such as natural numbers with the order relation. However, there are interpreted structures (even with decidable first-order theory) for which this problem is undecidable and for which the class of safe queries cannot be described syntactically.

Accordingly, there is a need for methods and apparatus for generating relational calculus queries in the context of certain types of database systems that are considered safe and thus solve the issues of undecidability and the lack of effective syntax.

SUMMARY OF THE INVENTION

The present invention provides for methods and apparatus for ensuring that queries presented to a database are safe, thus providing solutions to both the undecidability and lack of effective syntax issues mentioned above. In a broad aspect of the invention, a method for use in a database system includes obtaining an original query entered by a user. The invention then provides for pre-processing the query before submittal to a database engine associated with the database system, wherein a result of the pre-processing operation is to ensure that a query provided to the engine is safe. As mentioned above, safety has a different meaning depending on the application. For example, safety may include ensuring that a query will return an output with finite results, or it may include ensuring that certain geometric properties in the query are preserved in the output.

In a first embodiment of the invention, the pre-processing step includes automatically modifying the original query to ensure a return of finite results from the database by inserting at least one range-restriction in the original query. The range-restriction specifies an upper bound on the results to be returned by the database as a set of roots of polynomials with coefficients coming from an active domain of the database and a finite set of constants. Preferably, the range-restriction is inserted regardless of whether the original query would have returned a finite result.

In a second embodiment, the original query is a conjunctive query and the pre-processing step includes analyzing the original conjunctive query to determine if the original conjunctive query would return finite results from the database. Further, the pre-processing step includes prompting the user to restrict the original conjunctive query, if the original query would not return finite results, by inserting at least one range-restriction in the original query. Again, the range-restriction specifies an upper bound on the results to be returned by the database as a set of roots of polynomials with coefficients coming from an active domain of the database and a finite set of constants.

In a third embodiment, the original query contains at least one geometric property and the pre-processing step includes coding the original query with effective syntax to ensure preservation of the at least one geometric property in results returned from the database. Thus, whenever a query includes a certain type of geometric object, the result returned by the database engine will include only that type of geometric object.

The invention applies to the standard relational calculus with interpreted functions on finite structures; we then apply these results to get structural restrictions on the behavior of queries in the constraint database model of Kanellakis, Kuper and Revesz, "Constraint Query Languages," Journal of Computer and System Sciences, 51 (1995), 26-52 (extended abstract in PODS'90, pp. 299-313). This model is motivated by new applications involving spatial and temporal data, which require storing and querying infinite collections. It extends the relational model by means of "generalized relations". These are possibly infinite sets defined by quantifier-free first-order formulae in the language of some underlying infinite structure $M = \langle U, \Omega \rangle$. Here $U$ is a set (assumed to be infinite), and $\Omega$ is a signature that consists of a number of interpreted functions and predicates over $U$. For example, in spatial applications, $M$ is usually the real field $\langle \mathbb{R}, +, *, 0, 1, < \rangle$, and generalized relations describe sets in $\mathbb{R}^n$.

A database given by a quantifier-free formula $\alpha(x_1, \ldots, x_n)$ in the language of $\Omega$ defines a (possibly infinite) subset of $U^n$ given by $D_\alpha = \{ \vec{a} \in U^n \mid M \models \alpha(\vec{a}) \}$.
Such databases are called finitely representable, as the formula $\alpha$ provides a finitary means for representing an infinite set. For example, if $\alpha(x, y) \equiv (x^2 + y^2 \le 1)$, then $D_\alpha$ is the circle of radius 1 with the center in (0, 0).

Relational calculus can be extended in a straightforward manner to this model, by incorporating atomic formulas which are $\Omega$-constraints, that is, atomic $\Omega$-formulae. For example, $\varphi(x, y) \equiv (D(x, y) \wedge y = x^2)$ is a first-order query which, on $D_\alpha$ defined above, returns the intersection of the circle with the graph of the function $y = x^2$.

For the constraint model, the invention enforces safety in the sense of the preservation of restricted geometric classes of databases within powerful constraint query languages. In particular, we deal with enforcing that polynomial constraint queries preserve linear constraint databases and subclasses thereof. Linear constraints are used to represent spatial data in many applications. Linear constraints have several advantages over polynomial constraints: the quantifier-elimination procedure is less costly, and numerous algorithms have been developed to deal with figures represented by linear constraints. At the same time, the extension of relational calculus with linear constraints has severely limited power as a query language. Thus, it is natural to use a more powerful language, such as relational calculus with polynomial constraints, to query databases represented by linear constraints. We also consider polynomial constraints over the smaller class of convex polyhedra, the latter being an even more restricted class that still captures spatial figures that are important for geometric modeling.

Since the class of constraints used in queries is more general than the class used to define databases, we encounter the safety problem again: the output of a polynomial constraint query when applied to a linear constraint database may fail to be a linear constraint database, and similarly the output of a polynomial constraint query on a convex polyhedron may fail to be a convex polyhedron. Generally, if spatial databases are required to have certain geometric properties, then the safety problem is whether those properties are preserved by a given query language. The invention provides means for enforcing that polynomial constraint queries do preserve geometric properties, in particular for the classes of convex polytopes, convex polyhedra, and compact semi-linear sets. It also provides means to determine whether a conjunctive polynomial constraint query does preserve the classes listed above.

BRIEF DESCRIPTION OF THE DRAWINGS

FIG. 1 is a block diagram of a relational spatial database management system according to an illustrative embodiment of the present invention;

FIG. 2 is a flow diagram of a method of automatic safe query translation according to an illustrative embodiment of the present invention;

FIGS. 3A and 3B are a flow diagram of a method of safe query analysis and user-input query restriction according to an illustrative embodiment of the present invention; and

FIGS. 4A and 4B are a flow diagram of a method of effective syntax query translation according to an illustrative embodiment of the present invention.

DETAILED DESCRIPTION OF PREFERRED EMBODIMENTS

The present invention is described below in the context of generation and use of safe queries in a spatial database. As is known, spatial databases store information related to spatial locations. Examples of spatial databases are computer aided design (CAD) databases (i.e., databases for storing design information about how objects are constructed) and geographic databases (i.e., databases for storing geographic information such as, for example, maps) referred to as Geographic Information Systems (GIS). Information associated with spatial databases is typically made up of geometric constructs (e.g., lines, triangles, polygons, etc.). It is to be appreciated that the teachings of the invention discussed herein are not limited to these specific examples but rather are also applicable to all databases having information with properties similar to geometric information, e.g., integer databases that support addition, multiplication and/or ordering operations. It is also to be noted that the invention is applicable to relational databases. As is known, a relational database is a database that stores data in the form of relations, which may be viewed as tables of data items. An attribute of a relation refers to a specific category of data in a relation. The term "tuple" refers to a particular data item or record stored in a table or other type of relation. A "query" is a request for information in some form entered by a user to the database system.

Further, as used herein, the term "processor" is intended to include any processing device, including a CPU (central processing unit) and associated memory. Accordingly, software instructions or code associated with implementing the methodologies of the present invention may be stored in associated memory and, when ready to be utilized, retrieved and executed by an appropriate CPU.

Referring initially to FIG. 1, a block diagram of a relational spatial database management system according to an illustrative embodiment of the present invention is shown. The system 10 includes a user interface and control unit 14 with which a user 12 of the system interacts to enter queries and view results associated with the database system 10. The user interface can include any conventional computer input/output devices. For example, the user interface and control unit 14 may include a keyboard and/or mouse for entering and modifying queries and data and a video terminal for viewing query results and other data associated with the other components of the system 10.

The database system 10 also includes a query pre-processor 16 coupled to the user interface 14. As will be explained in detail below, the query pre-processor 16 performs query analysis and translation operations according to the invention. For instance, the pre-processor 16 analyzes a user input query to determine if the ranges associated with the query are restricted to ensure safety. Also, the pre-processor 16 performs transformations (i.e., translates) on a user input query to return a query that is restricted to ensure safety.

The database system 10 also includes a spatial database engine 18 coupled to the user interface 14. As will be explained in detail below, the spatial database engine 18 controls the storage of spatial information such as, for example, Point and Region data in a data store 20 coupled thereto. Also, the spatial database engine 18 processes queries received from the user interface 14.

It is to be appreciated that the components depicted in the relational spatial database management system 10 may be implemented in accordance with one or more general or special purpose processors. Depending on the amount and complexity of the spatial data, the system 10 may be embodied in a variety of processor-based machines, e.g., one or more personal computers, workstations, microcomputers, etc.
For ease of reference, the remainder of the detailed description will be divided into sections as follows: (I) Notations; (II) Data Structure Qualities; (III) Automatic Safe Query Translation; (IV) [illegible]; and (V) [illegible] Via Preservation of Geometric Properties.

I. Notations

It is to be appreciated that new applications of database technology, such as Geographical Information Systems (GIS), have spurred a considerable amount of research into generalizations of the standard relational database model to deal with the manipulation of geometric or spatial data. One common approach to modeling spatial databases is to consider input databases as given by a set of well-behaved relations in euclidean space, for example, by a set of semi-linear or semi-algebraic sets. There are a number of proposed query languages that extend classical relational algebra to this setting, languages that allow the use of various geometric operations in manipulating spatial databases. One of the most well-developed models for spatial queries is the constraint database model as disclosed in P. Kanellakis, G. Kuper, and P. Revesz, "Constraint Query Languages," JCSS, 51 (1995), 26-52, extended abstract in PODS'90, pages 299-313. In this model, spatial databases are represented as sets of linear or polynomial constraints. Databases are queried using standard relational calculus with linear or polynomial inequalities as selection criteria. These languages, denoted by FO+LIN and FO+POLY, have become the dominant ones in the constraint database literature. FO+EXP+POLY is another. These languages have a very important closure property: the application of a FO+LIN query to a linear constraint set yields a new set of linear constraints; similarly FO+POLY queries on sets definable with polynomial constraints produce sets that can still be defined with polynomial constraints.

The notations we use are fairly standard in the literature on databases. We study databases over infinite structures. Let $M = \langle U, \Omega \rangle$ be an infinite structure, where $U$ is an infinite set, called a carrier (in the database literature it is often called domain), and $\Omega$ is a set of interpreted functions, constants, and predicates. For example, for the real field $\langle \mathbb{R}, +, *, 0, 1, < \rangle$, the carrier is $\mathbb{R}$ (the set of real numbers), and the signature consists of the functions + and *, constants 0 and 1, and predicate <.

A (relational) database schema SC is a nonempty collection of relation names $\{S_1, \ldots, S_l\}$ with associated arities $p_1, \ldots, p_l > 0$. Given $M$, an instance of SC over $M$ is a family of finite sets, $\{R_1, \ldots, R_l\}$, where $R_i \subset U^{p_i}$. That is, each schema symbol $S_i$ of arity $p_i$ is interpreted as a finite $p_i$-ary relation over $U$. Given an instance $D$, $adom(D)$ denotes its active domain, that is, the set of all elements that occur in the relations in $D$. We normally assume $adom(D)$ is not empty.

As our basic query language, we consider relational calculus, or first-order logic, FO, over the underlying models and the database schema. In what follows, $L(SC, \Omega)$ stands for the language that contains all symbols of SC and $\Omega$; by $FO(SC, \Omega)$ we mean the class of all first-order formulae built up from the atomic SC and $\Omega$-formulae by using Boolean connectives and quantifiers $\forall$, $\exists$ and $\forall x \in adom$, $\exists x \in adom$. When $\Omega$ is $(+, -, 0, 1, <)$, $(+, *, 0, 1, <)$, or $(+, *, e^x, 0, 1, <)$, we use the standard abbreviations FO+LIN, FO+POLY, or FO+EXP, respectively, often omitting the schema when it is understood. Regardless of whether we are in the "classical" setting, where these queries are applied to finite databases, or in the constraint query setting discussed herein, we will refer to the syntactic query languages as relational calculus with constraints $\Omega$.

The semantics are as follows. For a structure $M$ and a SC-instance $D$, the notion of $(M, D) \models \varphi$ is defined in a standard way for $FO(SC, \Omega)$ formulae, where $\exists x \in adom$ is the active-domain quantification. That is, $(M, D) \models \exists x\, \varphi(x, \cdot)$ if for some $a \in U$ we have $(M, D) \models \varphi(a, \cdot)$, and $(M, D) \models \exists x \in adom\, \varphi(x, \cdot)$ if for some $a \in adom(D)$ we have $(M, D) \models \varphi(a, \cdot)$. If $M$ is understood, we write $D \models \varphi$. The output of a query $\varphi(x_1, \ldots, x_n)$ on $D$ is $\{ \vec{a} = (a_1, \ldots, a_n) \in U^n \mid D \models \varphi(\vec{a}) \}$, and it is denoted by $\varphi[D]$. For example, $\varphi(x, y) \equiv (S(x, y) \wedge y = x * x)$ is a FO+POLY query over the schema that contains one binary relation $S$; and $\varphi[D]$ is the set of pairs $(x, y)$ in $D$ where $y = x^2$.

We now say that a $FO(SC, \Omega)$ query $\varphi(\vec{x})$ is safe for a SC-database $D$ if it has finitely many satisfiers for $D$; that is, $\varphi[D]$ is finite. A query is safe if it is safe for all databases.

As previously explained, we need to distinguish a class of well-behaved models. We use o-minimality and quantifier-elimination, as are known in the art, for this purpose. We say that $M$ is o-minimal if every definable set is a finite union of points and open intervals $\{x \mid a < x < b\}$ (we assume that < is in $\Omega$). Definable sets are those of the form $\{x \mid M \models \varphi(x)\}$, where $\varphi$ is a first-order formula in the language of $M$, possibly supplemented with symbols for constants from $M$. We say that $M$ admits quantifier-elimination (QE) if, for every formula $\varphi(\vec{x})$, there is an equivalent quantifier-free formula $\psi(\vec{x})$ such that $M \models \forall \vec{x}.\, \varphi(\vec{x}) \leftrightarrow \psi(\vec{x})$. Below we list the most important examples of classes of interpreted structures and constraints often used in applications:

Linear Constraints: $\langle \mathbb{R}, +, -, 0, 1, < \rangle$ is o-minimal, has QE, and its first-order theory is decidable.

Polynomial Constraints: The real field $\langle \mathbb{R}, +, *, 0, 1, < \rangle$ is o-minimal, has QE, and its first-order theory is decidable. This follows from Tarski's theorem, as is known in the art.

Exponential Constraints: $\langle \mathbb{R}, +, *, e^x, 0, 1, < \rangle$ is o-minimal.

The following are results about o-minimal structures that will be used herein:

Fact 1.1 (Uniform Bounds)

If $M$ is o-minimal, and $\varphi(x, \vec{y})$ is a first-order formula in the language of $M$, possibly supplemented with symbols for constants from $M$, then there is an integer $K$ such that, for each vector $\vec{a}$ from $M$, the set $\{x \mid M \models \varphi(x, \vec{a})\}$ is composed of fewer than $K$ intervals.

If only quantifiers $\forall x \in adom$ and $\exists x \in adom$ are used in a query, it is called an active-semantics (or active-domain) query. This is the usual semantics for databases, and it will be the one used in most of the results here. If quantification over the entire infinite universe is allowed, we speak of a natural-semantics query. Active-semantics queries admit the standard bottom-up evaluation, while for natural-semantics queries it is not clear a priori if they can be evaluated at all. However, in many cases one can restrict one's attention to active-semantics queries. The following result was first shown for the pure case (no interpreted structure), for linear constraints, and then for a large class of structures as follows:

Fact 1.2 (Natural-Active Collapse)

If $M$ is o-minimal and has QE, and $\varphi$ is an arbitrary $FO(SC, \Omega)$ query, then there exists an equivalent $FO(SC, \Omega)$ active-semantics query $\varphi_{act}$. Moreover, if the first-order
theory of $M$ is decidable and QE is effective, then the translation $\varphi \to \varphi_{act}$ is effective.

We now define the classes of conjunctive queries (CQ), unions of conjunctive queries (UCQ) and Boolean combinations of conjunctive queries (BCCQ) in the interpreted setting. CQs are built up from atomic SC formulae and arbitrary $\Omega$-formulae by using $\wedge$ and quantifiers $\exists x$ and $\exists x \in adom$. Note that we can always assume that parameters of each SC relation are variables, as $\Omega$-terms can be eliminated by using existential (active-domain) quantifiers. It is easy to see that each CQ can be represented in the form:

$\varphi(\vec{z}) \equiv \exists \vec{x}\, \exists \vec{y} \in adom\;\; S_1(\vec{u}_1) \wedge \ldots \wedge S_k(\vec{u}_k) \wedge \gamma(\vec{x}, \vec{y}, \vec{z})$

where the $S_i$'s are schema relations (not necessarily distinct), $\vec{u}_i$ is a vector of variables from $\vec{x}, \vec{y}, \vec{z}$ of appropriate arity, and $\gamma$ is an $\Omega$-formula. If $\Omega = \emptyset$ and $\gamma = \emptyset$, this is the usual notion of conjunctive queries. If $\gamma$ is quantifier-free, this is another notion of conjunctive queries.

We define UCQs to be built up from atomic SC formulae and arbitrary $\Omega$-formulae by using $\wedge$, $\vee$, and quantifiers $\exists x$ and $\exists x \in adom$. Again, it is easy to see that those are precisely the queries of the form $\varphi_1 \vee \ldots \vee \varphi_k$ where each $\varphi_i$ is a CQ. Finally, BCCQs are arbitrary Boolean combinations of CQs.

Although we could define active-domain versions of conjunctive queries, the results we state here (e.g., decidability of safety) for the more general classes above will imply the corresponding results for the more restricted class of active-domain conjunctive queries.

II. Data Structure Qualities

As previously mentioned, the present invention is applicable to databases containing spatial information. In particular embodiments to be discussed below in Sections III, IV and V, the particular type of data includes geometric constructs. However, the invention finds broader application in databases having a data model or structure that exhibits certain qualities. That is, analysis and safe query translation according to the methodologies of the invention are applicable to databases having such qualities. These preferred qualities will be described below. This section further shows that the kind of interpreted structure one uses does matter when one studies query safety. As a by-product, we show that the state-safety problem is decidable over certain structures. We study safe translations defined as follows.

Definition 2.1

We say that there is a safe translation of (active-semantics) first-order queries over $M$ if there is a function $\varphi \to \varphi_{safe}$ on (active-semantics) formulae such that for every $\varphi$:

(1) $\varphi_{safe}$ is safe, and

(2) If $\varphi$ is safe for $D$, then $\varphi[D] = \varphi_{safe}[D]$.

A translation is canonical if $\varphi_{safe}[D] = \emptyset$ whenever $\varphi$ is not safe on $D$. A translation is recursive if the function $\varphi \to \varphi_{safe}$ is recursive.

It is known that there are domains over which safe translations do not exist. However, we restrict our attention to active-domain quantification. To this end, the following are generalized results.

Pre-processing Construction 2.1

Let $M$ be: (i) o-minimal; (ii) based on a dense order; (iii) admit effective QE; and (iv) have a decidable theory. Then it can be proven that there exists a recursive canonical safe translation of active-semantics formulae over $M$.

The above construction is achieved in the following manner. Let $\alpha(x)$ be a formula defining the active domain of the output of $\varphi$. Let $\Psi$ be the sentence $\neg \exists x_1, x_2\, ((x_1 < x_2) \wedge (\forall x.\, x_1 < x < x_2 \to \alpha(x)))$ and let $\Psi'$ be an equivalent active-semantics sentence, as above. We then define $\varphi_{safe}$ as $(\varphi \wedge \Psi')$. The construction then follows from the following statement: $D \models \Psi'$ if and only if $\varphi[D]$ is finite. Examples of structures satisfying the conditions of the above construction are $\langle \mathbb{R}, +, -, 0, 1, < \rangle$ and $\langle \mathbb{R}, +, *, 0, 1, < \rangle$.

Corollary 2.1

An immediate corollary to the above proposition is as follows. Let $M$ be as in the above proposition. Then, the state-safety problem, previously explained, over $M$ is decidable. This is proven, as above, by demonstrating that the active-semantics sentence $\Psi'$ tests if $\varphi[D]$ is finite.

It is to be appreciated that safe translations, recursive or not, need not exist even when one restricts one's attention to active-semantics queries and all predicates in the signature $\Omega$ are computable. It is to be further appreciated that in the remainder of the detailed description, we concentrate on "well-behaved" structures, typically o-minimal ones. For computability, we often impose QE and decidability of the first-order theory.

III. Automatic Safe Query Translation

In a first embodiment, the invention provides for taking an input query entered by a user and automatically translating the query into a safe query. This is accomplished in accordance with query pre-processor 16 (FIG. 1). As will be explained and illustrated in the context of FIG. 2, the query pre-processor automatically adds restrictions or filters to the user's original query to ensure that the query is of a form that will return only finite results when provided to the spatial database engine 18. This operation is referred to herein as range-restriction. It is to be appreciated that the added restriction or restrictions may be redundant, that is, the query entered by the user may have been safe to begin with; however, by automatically adding the restrictions, the pre-processor advantageously guarantees safety.

It is to be appreciated that the following is a mathematical explanation of range-restriction according to the invention in the context of the notation described in the previous sections. An illustrative scenario will follow this explanation. It is to be further appreciated that, given such detailed explanation of the invention, one of ordinary skill in the art may generate software code for implementing such principles for execution by a query pre-processor.

We informally describe the concept of range-restriction for databases over interpreted structures. It can be seen as a generalization, to arbitrary structures, of the idea of finiteness dependencies. Consider a query $\varphi(x, y)$ over a database which is a finite set $S$ of real numbers:

$\varphi(x, y) \equiv \exists z\, [S(z) \wedge (z > y) \wedge (x > 0) \wedge (x \cdot x = z) \wedge (y + y = z)]$

This query defines a set of pairs of reals. It is safe, as the size of its output is at most twice the size of the input. Moreover, and this is a key observation, from the query $\varphi$ and any database $S$, one can compute an upper bound on the output $\varphi[S]$: indeed, every element in $adom(\varphi[S])$ is either $\sqrt{a}$ or $a/2$ for some element $a \in S$. Equivalently, in this upper bound every element is a solution to either $x^2 = a$ or $2x = a$ when $a$ ranges over $S$. That is, an upper bound on the result of a safe query is found as the set of roots of polynomials with coefficients coming from the active domain of the database and a finite set of constants.

This is essentially the idea of range-restriction: we find, for a safe query, a set of formulae defining an upper bound on the output. We are interested in how the underlying structure interacts with queries. For example, the invention
provides that not only a set of range-restriction formulae
exists for any query over a well-behaved structure, but also,
under some conditions, it can be found effectively. Further,
the invention provides that the upper bound works not only for
safe queries, but for arbitrary queries, provided they are safe
on a given database.

It is to be understood that an analog of the set of roots of
polynomials must be found when dealing with arbitrary
structures (e.g., $\langle \mathbb{R}, +, *, e^x \rangle$). The
solution to this is provided below by the model-theoretic notion
of algebraic formulae in subsection (a). The range-restriction
conversion technique is provided in subsection (b). Then, in
subsection (c), we give two examples: the pure case, where our
main result trivially translates into a classical relational
theory result, and a much less trivial FO+POLY case, where we
demonstrate that the upper bound is a set of roots of
polynomials. Subsection (d) provides some extensions of the main
result.

(a) Algebraic formulae over o-minimal structures

Here we consider formulae over M, that is, first-order formulae
in the language $L(\Omega)$. We consider formulae
$\varphi(\vec{x}; \vec{y})$ with distinguished parameter
variables $\vec{y}$; we use ";" to separate those variables.
Assume that $\vec{x}$ is of length n and $\vec{y}$ is of length
m. Such a formula (in the language $\Omega$ and constants for
elements of U) is called algebraic if for each $\vec{b}$ in $U^m$
there are only finitely many satisfiers of
$\varphi(\vec{x}; \vec{b})$; that is, the set
$\{\vec{a} \in U^n \mid M \models \varphi(\vec{a}; \vec{b})\}$ is
finite. A collection of formulae is algebraic if each of its
elements is algebraic.

We need a syntax for algebraic formulae that will be used in the
algorithms that guarantee safety. This syntax is provided below.

Let $\Xi = \{\xi_1(x; \vec{y}), \ldots, \xi_k(x; \vec{y})\}$ be a
collection of formulae. Let $same_\Xi(x, z; \vec{y})$ be
$\bigwedge_{i=1}^{k} (\xi_i(x; \vec{y}) \leftrightarrow \xi_i(z; \vec{y}))$.
Now define

$\beta_\Xi(x; \vec{y}) \equiv \forall x', x''.\ x' < x < x'' \rightarrow \exists z\, (x' \le z \le x'' \wedge \neg same_\Xi(x, z; \vec{y}))$

Proposition 3.1

Let M be an o-minimal structure based on a dense order. Then a
formula $\varphi(x; \vec{y})$ is algebraic (over M) if and only
if there exists a collection of formulae $\Xi$ such that
$\varphi$ is equivalent to $\beta_\Xi$. A formula
$\varphi(\vec{x}; \vec{y})$ is algebraic if and only if there
exists a collection of formulae $\Xi$ in variables
$(x; \vec{y})$ and a formula $\psi(\vec{x}; \vec{y})$ such that
$\varphi$ is equivalent to
$\beta_\Xi(x_1; \vec{y}) \wedge \ldots \wedge \beta_\Xi(x_n; \vec{y}) \wedge \psi(\vec{x}; \vec{y})$.

The one-variable case can be proven as follows. Let $\Xi$ be a
collection of formulae, and assume that $\beta_\Xi$ is not
algebraic. That is, for some $\vec{b}$ over U,
$\beta_\Xi(M; \vec{b}) = \{a \mid M \models \beta_\Xi(a; \vec{b})\}$
is infinite. Since M is o-minimal, $\beta_\Xi(M; \vec{b})$ is a
finite union of points and intervals. Since < is dense, it means
that there exist $a_0 < b_0 \in U$ such that
$[a_0, b_0] \subseteq \beta_\Xi(M; \vec{b})$. We now consider the
formulae $\xi'_i(x) = \xi_i(x; \vec{b})$ for all
$\xi_i \in \Xi$. Since both $\xi'_i(M)$ and $\neg\xi'_i(M)$ are
finite unions of intervals and < is dense, for every
non-degenerate interval I, it is the case that either
$I \cap \xi'_i(M)$ or $I \cap \neg\xi'_i(M)$ contains an infinite
(closed) interval. Using this, we construct a sequence of
intervals as follows: $I_0 = [a_0, b_0]$, and $I_1 \subseteq I_0$
is an interval that is contained either in $I_0 \cap \xi'_1(M)$
or in $I_0 \cap \neg\xi'_1(M)$. At the jth step,
$I_j \subseteq I_{j-1}$ is an interval that is contained either
in $I_{j-1} \cap \xi'_j(M)$ or in $I_{j-1} \cap \neg\xi'_j(M)$.
Let $I = I_k$. Then, for any $c, d \in I$ and every i,
$M \models \xi_i(c; \vec{b}) \leftrightarrow \xi_i(d; \vec{b})$.
Since $I = [a', b'] \subseteq [a_0, b_0]$ and
$M \models \beta_\Xi(c; \vec{b})$ for all $c \in I$, we obtain
that, for every $c \in [a', b']$, there exists $d \in [a', b']$
such that $M \models \neg same_\Xi(c, d; \vec{b})$. That is, for
some $\xi_i \in \Xi$,
$M \models \neg(\xi_i(c; \vec{b}) \leftrightarrow \xi_i(d; \vec{b}))$,
which is impossible by construction of I. This proves that
$\beta_\Xi$ is algebraic.

For the converse, we let, for any $\beta(x; \vec{y})$, $\Xi$
consist of just $\beta$. That is, $\beta_\Xi(x; \vec{y})$ is:

$\forall x', x''.\ x' < x < x'' \rightarrow (\exists z.\ x' \le z \le x'' \wedge \neg(\beta(x; \vec{y}) \leftrightarrow \beta(z; \vec{y})))$

We claim that $\beta$ and $\beta_\Xi$ are equivalent, if $\beta$
is algebraic. Fix any $\vec{b}$ of the same length as $\vec{y}$,
and assume that $\beta(a; \vec{b})$ holds. If
$\beta_\Xi(a; \vec{b})$ does not hold, then there exist
$a' < a < a''$ such that for every $c \in [a', a'']$,
$\beta(c; \vec{b}) \leftrightarrow \beta(a; \vec{b})$ holds; thus
$\beta(c; \vec{b})$ holds for infinitely many c, contradicting
algebraicity of $\beta$. Hence, $\beta_\Xi(a; \vec{b})$ holds.
Conversely, assume that $\beta_\Xi(a; \vec{b})$ holds. If
$\beta(a; \vec{b})$ does not hold, then there is an interval
containing a on which $\beta(\cdot; \vec{b})$ does not hold.
Indeed, $\neg\beta(M; \vec{b})$ is a finite union of intervals,
whose complement is a finite set of points, so the above
observation follows from the density. We now pick $a' < a''$
such that $\beta(\cdot; \vec{b})$ does not hold on $[a', a'']$.
Since $\beta_\Xi(a; \vec{b})$ holds, we find $c \in [a', a'']$
such that
$\neg(\beta(a; \vec{b}) \leftrightarrow \beta(c; \vec{b}))$
holds, that is, $\beta(c; \vec{b})$ holds for
$c \in [a', a'']$, which is impossible. Thus, we conclude that
$\beta(a; \vec{b})$ holds, proving that $\beta$ and $\beta_\Xi$
are equivalent, and finishing the one-variable case.

For the multi-variable case, we note that algebraicity of
$\beta_\Xi$ implies that
$\varphi'(\vec{x}; \vec{y}) \equiv \beta_\Xi(x_1; \vec{y}) \wedge \ldots \wedge \beta_\Xi(x_n; \vec{y}) \wedge \psi(\vec{x}; \vec{y})$
is algebraic. Conversely, let $\varphi(\vec{x}; \vec{y})$ be
algebraic. Consider:

$\varphi_i(x_i; \vec{y}) \equiv \exists x_1 \ldots \exists x_{i-1} \exists x_{i+1} \ldots \exists x_n\, \varphi(\vec{x}; \vec{y})$

Let $\chi(x; \vec{y})$ be
$\varphi_1(x; \vec{y}) \vee \ldots \vee \varphi_n(x; \vec{y})$.
Obviously each $\varphi_i$ is algebraic, and thus
$\chi(x; \vec{y})$ is algebraic. Hence, $\chi(x; \vec{y})$ is
equivalent to $\beta_\Xi(x; \vec{y})$ for some finite collection
$\Xi$ of formulae $\xi_i(x; \vec{y})$. Note that if
$\varphi(\vec{a}; \vec{b})$ holds and $a_i$ is a component of
$\vec{a}$, then $\chi(a_i; \vec{b})$ holds and thus
$\beta_\Xi(a_i; \vec{b})$ holds. This shows that $\varphi$ is
equivalent to
$\beta_\Xi(x_1; \vec{y}) \wedge \ldots \wedge \beta_\Xi(x_n; \vec{y}) \wedge \varphi(\vec{x}; \vec{y})$,
thus completing the proof.

Corollary 3.1

If M is an o-minimal structure based on a dense order, and
$\varphi(\vec{x}; \vec{y})$ is algebraic over M, then $\varphi$
is algebraic over any M' elementary equivalent to M.

This is proven in the following manner. There exists a
collection of formulae $\Xi$ such that
$M \models \forall \vec{x}\, \forall \vec{y}\, (\varphi(\vec{x}; \vec{y}) \leftrightarrow (\varphi(\vec{x}; \vec{y}) \wedge \bigwedge_i \beta_\Xi(x_i; \vec{y})))$.
Hence the same sentence is true
in M'. Since M' is also o-minimal and based on a dense order,
we get from the proposition that $\varphi$ is algebraic in M',
too.

(b) Main Pre-processing Construction

We start with a few definitions. For an $L(\Omega)$-formula
$\gamma(\vec{x}; \vec{y})$ and database D, let:

$\gamma(D) = \{\vec{a} \mid \exists \vec{b} \in adom(D)^{|\vec{y}|} \text{ such that } D \models \gamma(\vec{a}; \vec{b})\}$

If $\Gamma$ is a collection of formulae in $\vec{x}; \vec{y}$,
define:

$\Gamma(D) = \bigcup_{\gamma \in \Gamma} \gamma(D)$.

Note that if $\Gamma$ is algebraic and finite, then $\Gamma(D)$
is finite.

Definition 3.1 (Range-restriction)

Let M be an interpreted structure. A range-restricted query
over M and a database schema SC is a pair
$Q = (\Gamma, \varphi(\vec{x}))$, where $\Gamma$ is a finite
collection
$\{\gamma_1(x; \vec{y}), \ldots, \gamma_m(x; \vec{y})\}$ of
algebraic $L(\Omega)$ formulae, and $\varphi(\vec{x})$ is an
$L(SC, \Omega)$ query. The semantics of Q is as follows:

$Q[D] = \{\vec{a} \in \Gamma(D)^n \mid D \models \varphi(\vec{a})\}$

That is, $\Gamma$ provides an upper bound on the output of a
query; within this bound, a usual first-order query is
evaluated. For example, let $\varphi(x)$ be the FO+POLY query
$S(x) \vee (x > 5)$. Clearly, it is not safe. Let now
$\gamma(x; y) \equiv (x * x = y)$ and $Q = (\{\gamma\}, \varphi)$.
Then, for any database S (which is a finite set of the reals),
Q[S] is the set of those elements a such that $a^2 \in S$ and
either $a \in S$ or $a > 5$. Clearly, this is a finite set.

It is therefore to be observed that every range-restricted
query is safe.

We call a range-restricted query $(\Gamma, \varphi)$
active-semantics if $\varphi$ is an active-semantics formula.
Note that $\Gamma$ does not mention the database. As will be
shown, range-restricted active queries characterize all the
safe active-semantics queries.

Construction 3.1

Let M be any o-minimal structure based on a dense linear order.
Then, there is a function Make_Safe that takes as input an
active-domain formula $\varphi(\vec{x})$, and outputs a
range-restricted active query $Q = (\Gamma, \psi)$ with the
property that Make_Safe($\varphi$) is equivalent to $\varphi$ on
all databases D for which $\varphi$ is safe. Furthermore, if M
has effective quantifier-elimination, then Make_Safe is
recursive.

This may be achieved in the following manner. We deal with the
one-variable case first. Let:

$\varphi(z) \equiv Q_1 w_1 \in adom \ldots Q_l w_l \in adom.\ \alpha(z, \vec{w})$

where each $Q_i$ is $\exists$ or $\forall$ and
$\alpha(z, \vec{w})$ is quantifier-free, and all atomic
subformulae R(. . .) contain only variables, and excluding z.
Any formula can be transformed into such by adding existential
quantifiers. Let
$\Xi = \{\xi_i(z, \vec{w}) \mid i = 1, \ldots, k\}$ be the
collection of all $\Omega$-atomic subformulae of $\alpha$. We may
assume without loss of generality that the length of $\vec{w}$
is nonzero, and that $\Xi$ is nonempty (if this is not true for
$\varphi$, take $\varphi \wedge \forall w \in adom\ (w = w)$ and
transform it to the above form).

Define $same_\Xi(a, b; \vec{w})$, as before, to be
$\bigwedge_{i=1}^{k} (\xi_i(a, \vec{w}) \leftrightarrow \xi_i(b, \vec{w}))$,
and define $\gamma(x; \vec{w})$ to be $\beta_\Xi(x; \vec{w})$;
that is,

$\gamma(x; \vec{w}) \equiv \forall x', x''.\ x' < x < x'' \rightarrow \exists y\, [x' \le y \le x'' \wedge \neg same_\Xi(x, y; \vec{w})]$

Now $\Gamma$ consists just of $\gamma$, with $\vec{w}$ being
distinguished parameters. We let Make_Safe($\varphi$) output
$(\{\gamma\}, \varphi)$.

Since $\gamma$ is algebraic by proposition 3.1, we must show
that
$\{a \mid D \models \varphi(a)\} = \{a \in \Gamma(D) \mid D \models \varphi(a)\}$
for every nonempty database for which $\varphi$ is safe.

Assume otherwise; that is, for some nonempty D for which
$\varphi$ is safe, we have $D \models \varphi(a)$ but
$a \notin \Gamma(D)$. Let $\vec{c}_1, \ldots, \vec{c}_M$ be an
enumeration of all vectors of the length of $\vec{w}$ of
elements of the active domain; note that $M > 0$. Since
$a \notin \Gamma(D)$, we have that for each
$i = 1, \ldots, M$, there exist $a'_i, a''_i$ such that
$a'_i < a < a''_i$ and $M \models same_\Xi(a, c; \vec{c}_i)$ for
all $c \in [a'_i, a''_i]$.

Let $b' = \max\{a'_i\}$, $b'' = \min\{a''_i\}$. We have
$b' < a < b''$, and for each $\vec{c}$ (of length of $\vec{w}$)
over the active domain, we have
$\xi_i(a, \vec{c}) \leftrightarrow \xi_i(c, \vec{c})$ for every
$c \in [b', b'']$. From this, by a simple induction on the
structure of the formula (using the fact that z does not appear
in any atomic formula R(. . .)), we obtain that
$D \models \alpha(a, \vec{c}) \leftrightarrow \alpha(c, \vec{c})$
for every $\vec{c}$ over adom(D) and every $c \in [b', b'']$, and
thus $D \models \varphi(a) \leftrightarrow \varphi(c)$, which
implies that $\varphi$ is not safe for D. This contradiction
proves correctness of Make_Safe for the one-variable case.

This completes the proof for the one-variable case. We handle
the multivariable case by reducing to the one-variable case.

Let $M' = \langle U, \Omega' \rangle$ be a definable extension of
M that has quantifier-elimination. Note that M' is o-minimal.
Let $\varphi(z_1, \ldots, z_n)$ be given, and define:

$\varphi_i(z_i) \equiv \exists z_1 \ldots \exists z_{i-1} \exists z_{i+1} \ldots \exists z_n\, \varphi(z_1, \ldots, z_{i-1}, z_i, z_{i+1}, \ldots, z_n)$.

There is an $L(SC, \Omega')$ active formula $\psi_i(z_i)$ such
that
$D \models \forall z.\ \psi_i(z) \leftrightarrow \varphi_i(z)$
for all D. Let $(\{\gamma_i(z_i; \vec{w}_i)\}, \psi_i(z_i))$ be
the output of Make_Safe on $\psi_i$. Since M' is a definable
extension, we can assume without loss that $\gamma_i$ is an
$\Omega$-formula. We now define:

$\gamma(z; \vec{w}_1, \ldots, \vec{w}_n) \equiv \gamma_1(z; \vec{w}_1) \vee \ldots \vee \gamma_n(z; \vec{w}_n)$,

where each $\vec{w}_i$ is of the same length as the vector of
distinguished parameters in the formulae $\gamma_i$. Finally,
Make_Safe($\varphi$) outputs $(\{\gamma\}, \varphi)$. To see that
it works, first notice that algebraicity of all $\gamma_i$'s
implies algebraicity of $\gamma$. Now assume that
$D \models \varphi(\vec{a})$ where
$\vec{a} = (a_1, \ldots, a_n)$. Then
$D \models \varphi_i(a_i)$, and thus for some vector
$\vec{c}_i$ of elements of the active domain, we have that
$\gamma_i(a_i, \vec{c}_i)$ holds. Thus, if $\vec{c}$ is the
concatenation of all $\vec{c}_i$'s, then
$\gamma(a_i, \vec{c})$ holds, showing
$\vec{a} \in \Gamma(D)^n$, where $\Gamma = \{\gamma\}$. This
completes the proof of the multivariable case.

We finally notice that Make_Safe for one-variable formulae is
recursive, no matter what M is. For the multivariable case, to
make it recursive, we need a procedure for converting
natural-quantification formulae into active-quantification
formulae. Such a procedure exists by Fact 1.2 above.
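The example given with Definition 3.1 can be sketched in code. The following is a minimal illustration only, assuming a database S held as a Python set of numbers and the single bounding formula $\gamma(x; y) \equiv (x * x = y)$; the function names are hypothetical and exact floating-point comparison is used purely for brevity.

```python
import math

def gamma_bound(S):
    """Gamma(S) for gamma(x; y) == (x*x == y): every real x whose square is in S."""
    bound = set()
    for y in S:
        if y >= 0:
            r = math.sqrt(y)
            bound.update({r, -r})
    return bound

def phi(x, S):
    """The unsafe query phi(x) == S(x) or (x > 5)."""
    return x in S or x > 5

def range_restricted_eval(S):
    """Q[S] = {a in Gamma(S) | phi(a)}: phi is evaluated only inside the bound."""
    return {a for a in gamma_bound(S) if phi(a, S)}

S = {2, 4, 16, 36}
answer = range_restricted_eval(S)
print(sorted(answer))
```

Because the candidates are drawn from the finite set Gamma(S), the result is finite even though phi alone would admit every real greater than 5.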
Corollary 3.2 (Range-restricted = Safe)

For any o-minimal structure based on a dense order, the 
class of safe active-semantics queries is the same as the class 
of range-restricted queries. 
Corollary 3.3 

For any o-minimal structure based on a dense order 
(decidable or not), the collection of safe queries is recur- 
sively enumerable. 

We finish this section with a proposition that follows from
the special form of formulae in $\Gamma$, as established in the
proof of construction 3.1.

It can be observed that by a modification of construction
3.1, the following is true.
Construction 3.2

Let M be o-minimal and based on a dense order. Let
$\varphi(\vec{x})$ be a first-order query. Then one can find a
set $\Gamma$ of algebraic formulae $\gamma(x; \vec{y})$ such
that, for any database D, if $\varphi[D]$ is finite, then
adom($\varphi[D]$) $\subseteq \Gamma(D)$.
(c) Example applying technique to polynomial constraints 
We now consider the real field, that is, with respect to the 
constraint language FO+POLY. One task is to find a more 
concrete representation of range-restricted queries over the 
real field. Intuitively, it should be sufficient to look for roots 

of polynomials p (x; a) where a ranges over tuples of 
elements of the active domain, as suggested by the example 
in the beginning of this section. However, even quantifier- 
free algebraic formulae do not give us directly this repre- 
sentation. Nevertheless, in accordance with the present 
invention, the following can be shown. 

Let $p(x; \vec{y})$ be a multivariate polynomial over the real
field. Define Root($p, \vec{a}$) as $\emptyset$ if
$p(x; \vec{a})$ is identically zero, and the set of roots of
$p(x; \vec{a})$ otherwise. Given a collection P of polynomials
$\{p_1(x, \vec{y}), \ldots, p_m(x, \vec{y})\}$ and a database
D, let:

$P(D) = \bigcup_{\vec{a} \subseteq adom(D)} \bigcup_{i=1}^{m} Root(p_i, \vec{a})$
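As a rough illustration of Root and P(D), the sketch below handles only polynomials of degree at most two in x with a single parameter y. This restriction, the coefficient-tuple representation and the function names are assumptions made for the example, not part of the construction. The two polynomials are those of the earlier example, $x^2 - a$ and $2x - a$.

```python
import math

def root(coeffs):
    """Real roots of c0 + c1*x + c2*x**2 for the coefficient tuple (c0, c1, c2).

    Returns the empty set when the polynomial is identically zero,
    matching the definition of Root, and also when it has no real root.
    """
    c0, c1, c2 = coeffs
    if c2 == 0:
        if c1 == 0:
            return set()
        return {-c0 / c1}
    disc = c1 * c1 - 4 * c2 * c0
    if disc < 0:
        return set()
    s = math.sqrt(disc)
    return {(-c1 + s) / (2 * c2), (-c1 - s) / (2 * c2)}

def p_of_d(polys, adom):
    """P(D): union of Root(p, a) over all p in P and all a in adom(D)."""
    out = set()
    for p in polys:
        for a in adom:
            out |= root(p(a))
    return out

# Each polynomial maps a parameter value a to its coefficients in x.
polys = [
    lambda a: (-a, 0, 1),  # x^2 - a
    lambda a: (-a, 2, 0),  # 2x - a
]
print(sorted(p_of_d(polys, {4, 16})))
```

The result is always finite, since each polynomial that is not identically zero contributes at most two roots per active-domain value.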

Definition 3.2 

A query in polynomial range-restricted form is a pair
$(P, \varphi)$, where P is a finite collection of multivariate
polynomials, and $\varphi(x_1, \ldots, x_n)$ is a FO+POLY query.
The semantics is defined as
$(P, \varphi)[D] = \varphi[D] \cap P(D)^n$.

Advantageously, the following proposition states one of 
the key aspects of the invention that enables the query 
pre-processor 16 (FIG. 1) to perform the analysis and 
translation operations explained herein. 
Construction 3.3 

For every FO+POLY query $\varphi$, a collection of polynomials
P can be effectively found such that $\varphi$ and
$(P, \varphi)$ are equivalent on all databases on which
$\varphi$ is safe.

This may be achieved in the following manner. Given a query
$\varphi(\vec{x})$, effectively find a collection of algebraic
formulae $\Gamma = \{\gamma_j(x; \vec{y})\}$ such that for any D
for which $\varphi$ is safe,
adom($\varphi[D]$) $\subseteq \Gamma(D)$. For each $\vec{a}$ and
each $\gamma \in \Gamma$, the set
$\gamma[\vec{a}] = \{c \mid \gamma(c; \vec{a})\}$ is finite, and
by o-minimality there is a uniform bound M such that
card($\gamma[\vec{a}]$) < M for all $\vec{a}$ and
$\gamma \in \Gamma$.

Now let $\gamma_{ij}(x; \vec{y})$, i < M, be defined as follows:
$\gamma_{ij}(c; \vec{a})$ holds if either: (i)
$\gamma_j[\vec{a}]$ has at least i elements, and c is the ith
element in the order <; or (ii) $\gamma_j[\vec{a}]$ is nonempty,
has fewer than i elements, and c is the largest element of
$\gamma_j[\vec{a}]$; or (iii) $\gamma_j[\vec{a}]$ is empty, and
c = 0. Note that the $\gamma_{ij}(x; \vec{y})$'s are indeed
L(+, *, 0, 1, <) formulae. It is easy to see that each
$\gamma_{ij}(x; \vec{y})$ defines a function
$f_{ij}: \mathbb{R}^m \rightarrow \mathbb{R}$, where m is the
length of $\vec{y}$, by $f_{ij}(\vec{a}) = c$ if and only if c
is the unique element such that $\gamma_{ij}(c; \vec{a})$ holds.
Furthermore, this function is semi-algebraic and the following
property holds: if $\varphi$ is safe for D, then
adom($\varphi[D]$) is contained in
$\bigcup_{i,j} \bigcup_{\vec{a}} f_{ij}(\vec{a})$, where
$\vec{a}$ ranges over adom(D).
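The three cases that define each selector function can be illustrated as follows. This is only a sketch: it assumes the finite set of satisfiers of a formula for a fixed parameter tuple has already been computed and is passed in as a Python set, and the name `select` is hypothetical.

```python
def select(i, satisfiers):
    """Value of the i-th selector (i counted from 1) on a finite satisfier set.

    (i)   at least i elements: the i-th smallest element;
    (ii)  nonempty with fewer than i elements: the largest element;
    (iii) empty: 0.
    """
    ordered = sorted(satisfiers)
    if len(ordered) >= i:
        return ordered[i - 1]
    if ordered:
        return ordered[-1]
    return 0

# With a uniform bound M on the number of satisfiers, the selectors for
# i = 1 .. M-1 together recover every satisfier of a nonempty set.
M = 4
satisfiers = {3.0, -1.5, 2.0}
recovered = {select(i, satisfiers) for i in range(1, M)}
print(recovered == satisfiers)
```

Each selector is a total function of the parameters, which is what allows it to be replaced by a polynomial in the next step of the argument.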

It is known that each $f_{ij}(\vec{y})$ is algebraic, that is,
there exists a polynomial $p_{ij}(x, \vec{y})$ such that
$p_{ij}(x, \vec{y}) = 0$ if and only if $x = f_{ij}(\vec{y})$.
It is easy to see that for
$P = \{p_{ij} \mid i < M, \gamma_j \in \Gamma\}$,
$\Gamma(D) \subseteq P(D)$ and P(D) is always finite. To
complete the proof, we must show effectiveness. We can
effectively construct $\Gamma$, and thus find M (by writing
formulae saying that each $\gamma(x; \vec{y})$ has fewer than M
satisfiers for each $\vec{y}$, and checking if it is true by
applying QE; since it is true for some M, the process
terminates). Hence, we can effectively construct the
$\gamma_{ij}$'s, and the procedure for finding the $p_{ij}$'s is
effective.
(d) Extensions 
Suppose we are given two elementary equivalent structures M
and M', for example, $\langle \mathbb{R}, +, < \rangle$ and
$\langle \mathbb{Q}, +, < \rangle$. Assuming M is o-minimal
based on a dense order, so is M', and thus the characterization
of safe queries applies to M' as well. Thus, it may be proven
that if M is an o-minimal structure based on a dense order, and
$\varphi$ is a safe active-semantics query, then $\varphi$ is
safe in any structure M' elementary equivalent to M.

Given the above-described teachings of the invention, an
illustrative scenario is presented below in the context of FIG.
2 whereby a query pre-processor of the invention performs
safe translation through range-restriction. It is to be
appreciated that the above sections presented the invention in
terms of first-order logic, or relational calculus, which is the
standard theoretical database query language. Practical
languages, most notably SQL, are based on relational
calculus. The basic form of SQL statements is:
SELECT <attribute list>
FROM <relations list>
WHERE <conditions>
SQL statements roughly correspond to conjunctive queries.
For example, the following SQL statement:
SELECT F1.Source, F2.Destination
FROM Flights F1, Flights F2
WHERE F1.Destination=F2.Source
selects pairs of cities from the flight database that are
reachable with at most one stopover. An equivalent conjunctive
query is:

$\varphi(x, y)$: $\exists u\, (Flights(x, u) \wedge Flights(u, y))$
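The correspondence between the SQL statement and the conjunctive query above can be checked on a small instance. The sketch below uses Python's built-in sqlite3 module; the table contents are invented sample data, and the set comprehension is simply a direct reading of the conjunctive query.

```python
import sqlite3

# Hypothetical sample data for the Flights relation.
flights = [("NYC", "CHI"), ("CHI", "LAX"), ("CHI", "DEN"), ("BOS", "NYC")]

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Flights (Source TEXT, Destination TEXT)")
conn.executemany("INSERT INTO Flights VALUES (?, ?)", flights)

# The SELECT-FROM-WHERE statement from the text.
sql_rows = set(conn.execute(
    "SELECT F1.Source, F2.Destination "
    "FROM Flights F1, Flights F2 "
    "WHERE F1.Destination = F2.Source"
))

# The conjunctive query: exists u (Flights(x, u) and Flights(u, y)).
cq_rows = {(x, y) for (x, u) in flights for (v, y) in flights if u == v}

print(sql_rows == cq_rows)
conn.close()
```

Both evaluations return the same set of pairs on this instance, which is the sense in which the basic SQL form corresponds to a conjunctive query.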

To achieve the power of safe first-order logic queries
(without constraints), SQL statements are closed under set
union, intersection and difference, and results of the queries
can be saved for future use.
The present invention shows how SQL design can be
extended to deal with constraints. We now illustrate this
below using the example shown in FIG. 2.
Referring to FIG. 2, a flow diagram is shown of a method
of automatic safe query translation in the context of geo- 
metric point data according to an illustrative embodiment of 
the present invention. In step 210, the user 12 enters a query
"Unsafe1" at the user interface 14. It is assumed that the
spatial database includes a table (relation) "POINTS" which 
has one non-key attribute "Value," and the relation stores 
real numbers. To have access to arbitrary real numbers, we 
assume a hypothetical relation called R with one attribute 
called X; its elements are all real numbers. Of course it is not 
stored in a database, but queries are allowed to use this 
relation. Thus, for example, the query entered by the user 
may be: 

SELECT X1=R.X,
X2=POINTS1.Value,
X3=POINTS2.Value
FROM R, POINTS POINTS1, POINTS POINTS2
WHERE X1*(X2-1)=X3
It is to be appreciated that query "Unsafe1" is unsafe since
if POINTS has values 0 and 1 stored in it, then any real
number X1 from R will satisfy the constraint X1*(X2-1)=X3
when X2=1 and X3=0. Thus, the query will have to
return all triples (x, 1, 0) where x is an arbitrary real number.
Hence, the query is unsafe.
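The reason the query is unsafe can be made concrete with a short sketch. It assumes POINTS is modelled as a Python set of numbers; the function names and sample values are hypothetical.

```python
def satisfies(x1, x2, x3):
    """The WHERE condition of the query: X1*(X2-1) = X3."""
    return x1 * (x2 - 1) == x3

def is_unsafe_on(points):
    """The output is infinite exactly when X2 = 1 and X3 = 0 are both available,

    because the condition then reduces to X1*0 = 0, which every real X1 meets.
    For X2 != 1 the condition fixes X1 = X3/(X2-1), a single value.
    """
    return 1 in points and 0 in points

# Any sampled real satisfies the condition once X2 = 1 and X3 = 0.
samples = [-7.5, 0.0, 3.14, 1e9]
print(all(satisfies(x, 1, 0) for x in samples))
print(is_unsafe_on({0, 1}), is_unsafe_on({2, 3}))
```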

Then, in step 220, the user interface 14 sends the query to 
the query pre-processor 16 which translates the original 
query and produces a query "Safe2" by adding a restriction 
to ensure that a finite result will be sent to engine 18. Thus, 
query "Safe2" is returned to the user 12 at the user interface 
14 as: 

SELECT X1=R.X,
X2=POINTS1.Value,
X3=POINTS2.Value
FROM R, POINTS POINTS1, POINTS POINTS2
WHERE X1*(X2-1)=X3
AND NOT EXISTS
(SELECT Value1=P1.Value, Value2=P2.Value
FROM POINTS P1, POINTS P2
WHERE Value1=1 AND Value2=0)

Then, in step 230 the user queries database engine 18 with 
"Safe2" via the user interface. The engine returns the 
appropriate response to the query whereby the response is a 
finite result. 
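To see why "Safe2" always yields a finite result, the query can be emulated over a finite POINTS relation. The sketch below is an illustration under the assumption that POINTS is a Python set of numbers; it does not model how the database engine 18 would actually evaluate the query, and the function name is hypothetical.

```python
def eval_safe2(points):
    """Emulate Safe2 over a finite set of POINTS values.

    The NOT EXISTS guard fails when both 1 and 0 occur in POINTS, and the
    result is then empty. Otherwise X2 = 1 can never be paired with X3 = 0,
    so each remaining pair (X2, X3) with X2 != 1 determines exactly one X1.
    """
    if 1 in points and 0 in points:
        return set()
    result = set()
    for x2 in points:
        for x3 in points:
            if x2 != 1:
                result.add((x3 / (x2 - 1), x2, x3))
    return result

print(sorted(eval_safe2({2, 3})))
print(eval_safe2({0, 1}))
```

On databases where the original query is already safe, the guard is satisfied and the output coincides with that of the original query.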

Advantageously, the query pre-processor effectively 
forces the original query entered by the user to be safe. This 
is accomplished by the pre-processor implementing the 
range-restriction operation, as explained above. The opera- 
tion effectively finds a set of polynomials that restrict the 
search range base associated with the engine 18. In other 
words, the query is modified to direct the engine to search 
for data in a certain range base. In effect, the polynomials are 
telling the engine how to increase the range to include the 
desired result. 

To do this, the pre-processor adds one or more restrictions
that direct the engine 18 to look only for answers that are
roots of the polynomials. Advantageously, this makes the query
result finite since a polynomial that is not identically zero
has only finitely many roots.

It is to be appreciated that the pre-processor may add
filters even if the query as originally entered is safe. In such
a case, the filters are merely redundant to the restrictions
originally included by the user, and the pre-processor does not
affect the result by adding them, since the original query would
have returned a finite result anyway.
IV. Safe Query Analysis and User-Input Query Restriction
In a second embodiment, the invention provides for
taking an input query entered by a user and analyzing it with
regard to safety. That is, according to the invention, the
user-entered query is analyzed to determine if it has been
restricted by a relation. If this is not the case, and therefore
the original query is unsafe, then the user is prompted to
enter one or more appropriate restrictions. Such analysis is
accomplished in accordance with the query pre-processor 16
(FIG. 1). As will be explained and illustrated in the context
of FIGS. 3A and 3B, the query pre-processor analyzes a user
input query and, if the query is determined to be unsafe,
prompts the user to add restrictions or filters to the original
query to ensure that the query is of a form that will return
only finite results when provided to the spatial database
engine 18.

It is to be appreciated that the following is a mathematical
explanation of such analysis according to the invention in
the context of the notation described in the previous sections.
An illustrative scenario will follow this explanation. It
is to be further appreciated that, given such detailed
explanation of the invention, one of ordinary skill in the art may
generate software code for implementing such principles for
execution by a query pre-processor.

Safety of arbitrary calculus queries is undecidable even in
the pure case, and of course it remains undecidable when
interpreted functions are present. As will be explained
below, the present invention provides that safety is decidable
for Boolean combinations of conjunctive queries in the
presence of an interpreted structure such as
$\langle \mathbb{R}, +, *, 0, 1, < \rangle$. In
particular, safety of FO+POLY and FO+LIN conjunctive
queries is decidable.

Recall that we are using CQ, UCQ and BCCQ for
conjunctive, unions of conjunctive, and Boolean combinations
of conjunctive queries (see Section I for their definition
in the presence of an interpreted structure). Our proof will be
by reduction to the containment problem, which is decidable
for UCQs over certain structures. Note that CQs and UCQs
are monotone (that is, $D \subseteq D'$ implies
$\varphi[D] \subseteq \varphi[D']$). Since
there are non-monotone BCCQs, the class of UCQs is
strictly contained in the class of BCCQs.

The main result here is expressed as follows. 
Construction 4.1 
Let M be o-minimal, based on a dense order, decidable,
and admit effective QE. Then it is decidable if a given
BCCQ $\varphi(\vec{x})$ over M is safe.

The proof is contained in the following sub-constructions.
Recall that by containment $\varphi \subseteq \psi$ we mean
$\varphi[D] \subseteq \psi[D]$ for any D.
Sub-construction 4.1

Let M be o-minimal and based on a dense order, and
$\varphi(\vec{x})$ be a first-order query. Then there exists an
active-semantics CQ $\psi(\vec{x})$ such that $\varphi$ is safe
if and only if $\varphi \subseteq \psi$.
Sub-construction 4.2

Let M be as in construction 4.1. Then containment of a
BCCQ in a UCQ is decidable; that is, for a BCCQ
$\varphi(\vec{x})$ and a UCQ $\psi(\vec{x})$ it is decidable if
$\varphi \subseteq \psi$. This continues to hold
if both $\varphi$ and $\psi$ are active-semantics queries.

The above two sub-constructions may be achieved as
follows. Sub-construction 4.1 follows from construction 3.2:
take $\Gamma = \{\gamma_1(x; \vec{y}), \ldots, \gamma_k(x; \vec{y})\}$
given by the proposition; let $\gamma = \bigvee_i \gamma_i$ and
let $\psi(x_1, \ldots, x_n)$ be:

$\exists \vec{y}_1 \in adom \ldots \exists \vec{y}_n \in adom\ \gamma(x_1; \vec{y}_1) \wedge \ldots \wedge \gamma(x_n; \vec{y}_n)$
If $\varphi \subseteq \psi$, then $\varphi$ is safe since all
$\gamma_i$'s are algebraic. If $\varphi$ is safe, then
adom($\varphi[D]$) $\subseteq \Gamma(D)$ for every D, which
implies $\varphi \subseteq \psi$.

Sub-construction 4.2 is as follows: Given $\varphi$ and $\psi$,
one can effectively find a number k such that
$\varphi \subseteq \psi$ if and only if for every database D
with at most k tuples, $\varphi[D] \subseteq \psi[D]$.

This clearly implies the result, as the latter condition can be
expressed as an $L(\Omega)$ sentence. For example, if the schema
contains one relational symbol S, this sentence is
$\forall \vec{x}_1 \ldots \forall \vec{x}_k \forall \vec{x}.\ \varphi(\vec{x})[\{\vec{x}_i\}/D] \rightarrow \psi(\vec{x})[\{\vec{x}_i\}/D]$,
where $\varphi(\vec{x})[\{\vec{x}_i\}/D]$ is obtained from
$\varphi(\vec{x})$ by replacing each occurrence of $S(\vec{z})$
by $\bigvee_i (\vec{z} = \vec{x}_i)$, and similarly for an
arbitrary schema. One can now easily complete the
sub-construction using the decidability of M.

Note that every BCCQ $\alpha$ can be represented as
$\bigvee_{i=1}^{n} (\chi_i \wedge \neg\xi_i)$, where each
$\chi_i$ is a CQ and each $\xi_i$ is a UCQ; this follows if one
writes $\alpha$ as a DNF. We take k to be the maximum length of
$\chi_i$ (measured as the sum of the number of atomic formulae
and the number of quantified variables).

Assume that $\varphi[D]$ is not a subset of $\psi[D]$ for some
D; that is, we have $\vec{a} \in \varphi[D] - \psi[D]$. Assume
$D \models \chi_i(\vec{a}) \wedge \neg\xi_i(\vec{a})$ and let
$\chi_i(\vec{x}) \equiv \exists \vec{y}\, \exists \vec{z} \in adom \bigwedge_j \alpha_j(\vec{x}, \vec{y}, \vec{z})$.
Then, for some $\vec{b}$ over U and $\vec{c}$ over adom(D) we
get $D \models \bigwedge_j \alpha_j(\vec{a}, \vec{b}, \vec{c})$.
Consider those $\alpha_j$'s that are SC-atomic; since each is of
the form R(. . .) where $R \in SC$, there is a tuple in D that
satisfies it. Select one such tuple for each SC-atomic
$\alpha_j$, and let D' be D restricted to those tuples. Choose a
set of at most length($\vec{c}$) tuples in D that contain all
the components of $\vec{c}$, and add it to D'. Let the resulting
database be D''. Clearly, it has at most k tuples.

Note that
$D'' \models \bigwedge_j \alpha_j(\vec{a}, \vec{b}, \vec{c})$
and thus $D'' \models \chi_i(\vec{a})$ since
$\vec{c} \subseteq adom(D'')$. On the other hand,
$D'' \models \neg\xi_i(\vec{a})$ by monotonicity of $\xi_i$.
Thus, we get that $\vec{a} \in \varphi[D''] - \psi[D'']$ where
D'' has at most k tuples. This implies that each counterexample
to containment is witnessed by a database with at most k
elements, and finishes the proof of correctness of
sub-construction 4.2.

To complete construction 4.1, note that under the assumption
on M, the CQ $\psi$ such that $\varphi \subseteq \psi$ is
equivalent to safety of $\varphi$ can be effectively
constructed; this follows from the procedure for constructing
$\Gamma$ given in construction 3.2. The construction can now
easily be completed using sub-construction 4.2.

Corollary 4.1

It is decidable whether any Boolean combination of FO+LIN or
FO+POLY conjunctive queries is safe. Note, however, that safety
of CQs is not decidable over every structure. For example, for
$\langle \mathbb{N}, +, *, 0, 1, < \rangle$, decidability of CQ
safety would imply decidability of checking whether a
Diophantine equation has finitely many solutions, which is
known to be undecidable.

Given the above-described teachings of the invention, an
illustrative scenario is presented below in the context of
FIGS. 3A and 3B whereby a query pre-processor of the invention
performs query analysis and then informs the user whether the
original query requires any restrictions or filters to be added
in order to make the query safe. We now switch to the SQL syntax
again. Recall that the basic SELECT-FROM-WHERE statement
corresponds to conjunctive queries, and this remains true in the
presence of constraints. Thus, safety of SELECT-FROM-WHERE SQL
queries is decidable, even if one uses polynomial constraints,
and has access to the hypothetical relation R that refers to all
real numbers.

Referring to FIGS. 3A and 3B, a flow diagram is shown of a
method of safe query analysis in the context of geometric point
data according to an illustrative embodiment of the present
invention. In step 310, the user 12 enters a query "Unsafe1" at
the user interface 14. For example, the query entered by the
user may be:

SELECT X1=R.X,
Y=POINTS.Value
FROM R, POINTS
WHERE (X1>Y+2)

Then, in step 320, the user interface 14 sends the query to the
query pre-processor 16 which analyzes the query. According to
the invention, the pre-processor determines that the original
query is unsafe since it is not restricted by a relation. The
pre-processor then instructs the user interface to prompt the
user for a restriction on X1.

In step 330, the user inputs "$X1^2 - POINTS.Value = 0$". In
this way, the user specifies that the engine should only
consider values x such that $x = \sqrt{y}$ for some y in
POINTS.Value. Thus, in step 340, the pre-processor generates
query "Safe1" by adding the user-input restriction to the
original query. In this way, the pre-processor is effectively
restricting the output of the original query to values
satisfying the restriction. "Safe1" is as follows:

SELECT X1=R.X,
Y=POINTS.Value
FROM R, POINTS
WHERE (X1>Y+2)
AND $X1^2 - POINTS.Value = 0$

Then, in step 350, the user queries database engine 18 with
"Safe1" via the user interface. The engine returns the
appropriate response to the query whereby the response is a
finite result. An example of a response that may be returned by
the engine to the user at the user interface is shown in block
350.

Advantageously, in this scenario, the query pre-processor
analyzes a query entered by the user and tells the user whether
or not the query is safe, as explained above. If the query is
determined to be unsafe, the user is then prompted to enter one
or more restrictions. The pre-processor then generates a new
query with the user-input restrictions, assuming the added
restriction now makes the query safe. It is to be appreciated
that the pre-processor analyzes the query with the restrictions
added before declaring it safe.

To do this, the pre-processor determines whether the one or
more restrictions will direct the engine 18 to look only for
answers that are roots of the polynomials. Advantageously, this
makes the query result finite since a polynomial that is not
identically zero has only finitely many roots.

V. Effective Syntax Query Translation Via Preservation of
Geometric Properties

In a third embodiment, the invention implements safe query
translation where safety has a different meaning than in the
previous scenarios and where the data considered is region data
rather than point data. Here, finiteness of an output is not at
issue, but rather the invention provides a technique for
creating a restricted query such that whenever a user-input
query includes certain geometric object types, e.g., triangles,
the output from the spatial database engine will have the same
object types. This is referred to herein as the operation of
preserving geometric properties. Such trans-
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lation is accomplished in accordance with the query pre- „ tt p mr ^, . ^ 

processor 16 (FIG. 1). As will be explained and illustrated in follows: for any X e Obj m (Q), x|>[XH y | (M, X) * Uj( y )}. 

the context of FIGS. 4A and 4B, the query pre-processor Clearl y *rfX] e Obj(Q), if M has quantifier-elimination, 

translates the query to ensure preservation of the same Let C be a class of objects in Obj(Q). We say that a 

geometric properties in the results returned by the engine. 5 first-order query q preserves C if for any X €C, M>[X] G C. 

It is to be appreciated that the following is a mathematical For example, C can be the class of convex polytopes in 

explanation of such translation according to the invention in SAlg. 

the context of the notation described in the previous sec- Thus, tDe safet y question for constraint databases is the 

tions. An illustrative scenario will follow this explanation. It following. Is there an effective syntax for the class of 

is to be further appreciated that, given such detailed expla- 10 C-preserving queries? Below, we show a solution, 

nation of the invention, one of ordinary skill in the art may Definition 5.1 

generate software code for implementing such principles for The class c nas a canonical representation in Obj(Q) if 

execution by a query pre-processor. tnere is a recursive infective function g: N->N with com- 

Accordingly, in this section, we deal with constraint pitf able inverse, and for each^ n, two functions code„: 

databases that represent potentially infinite objects. The 15 2 ~* 2 and decode„: 2 -*2 , where m=g(n), such that: 

notion of safety over constraint databases is different: we are (1) decode„° code„ (x)»x if x E Obj„(Q); 

interested in identifying languages that guarantee preserva- (2) code„ (x) is finite if x e C; decode„(x) G C if x is finite; 

tion of certain geometric properties. (3) cod ^ F0(fi)KlefiDable on 0b j„(Q); 

To give a very simple example, assume that spatial objects ; ' , „ . * ' , „ . , 1 . 

* ' v y (4) decode,, is FO(Q)-definable on finite sets, 
stored in a database are convex polytopes in 91 . A simple 20 IntuitivdVf tne canonical representation is a finite repre- 
query "return the convex hull of all the vertices x with||x||<l" Nation of C within Obj(Q) that can be defined in first- 
does always return a convex polytope. This query must be order logic over M For example, an approach to obtaining 
written in a rather expressive language: it can be expressed a canomC al representation of convex polytopes would be to 
in FO+POLY but not FO+LIN. Now, our question is: can we compu t e their vertices. This suffices to reconstruct the 
ensure in some way that a class of FO+POLY programs 25 polytope> and the vertices can be defined b a firstK)lder 
preserves a given property, like being a convex polytope ? formula ^ actual representa tion proposition 5.1 below) is 
That is, can we find an effective syntax for the class of mdeed based OQ computing the vertices, 
queries that preserve certain geometric properties ? In accor- Nexl we show tha( canooical representations solve the 
dance with the invention, the answer to these questions is safety problem. We always assume that the set Q is recur- 
y es - 30 sive. 

First, we present a general scheme for enumerating such Construction 5.1 
queries in FO(M). Here M is some structure on the reals, not 

/ V Let M=\U, Q) be o-minimal, based on a dense order, 

necessarily \% +, *, 0,1, </. The approach is based on decidable, and have effective QE. Suppose C is a class that 

reduction to the finite case, and using our results about finite 35 has a canonica i representation in Obj(Q). Then there is an 

query safety. , t effective syntax for C-preserving FO(Q) queries; that is, 

We explain the invention for three geometric properties: (here ex ists a recursively enumerable set of FO(Q) queries 

(1) a convex polytope; (11) a convex polyhedron; and (iii) a that express exactly all C-preserving FO(Q) queries, 

compact semi-linear set in SR 2 (the latter are perhaps the This may be achieved in the following manner. Consider 

most often encountered class of constraint databases). 40 # » 

We then demonstrate that for unions of conjunctive aQ enumeration of all safe FO(Q) queries \$J (from Cor- 

FO+POLY queries, it is decidable whether they preserve ollary 3.2 we know that it exists). Let $ use the extra relation 

convex polytopes or compact semi-linear sets in tt 2 . symbol of arity m and assume that n is such that g(n)=m; 

To define a general framework for explaining queries that M ^ven the assumptions we can compute ^ that. Let * have 1 

preserve geometric properties, we recall some basic defini- 45 P"™f™> and *Z™ let k be such that g(k)»l. If n and k are 

tions on constraint (or finitely representable) databases. As found for a given ♦* we let ^ be: 
before, we have a language of some underlying structure M 

and a schema SC, but now m -relations in SC are given by decoded code„ 
quantifier-free formulae a(x lf . . . x m m)in L (Q). It is to be 



appreciated that we do not assume relational attributes for 50 This produces the required enumeration. So we have to 

/ \ check that every query above preserves C, and for every 

the sake of ease of notation. If M is \% +, 0,1, </, then sets C-preserving ip we can get <(> such that decode 0 $° code 

, £J „ - ... r/<v» * « ., \ , coincides with ip. The first one is clear: if we have X G C, 

so defined are called semi-linear; for \9L+, * 0,1, </ they A / V \ • « •* u xr j /wi • c u > 1 

„ , . , . . i r cc men code «( x ) is finite, hence 4>.f code„(X)l is finite too, and 

are called semi-algebraic. The query languages for con- 55 aL'a* ♦ u- « • V- 

. . . , . . & t , t t_ • JJ r applymg decode*, we get an object in C. 

straint databases are the same as those we considered for V/ *? u o 

finite nne.' FO 0\ For the converse > suppose we have a C-preserving query 

nnue ones, ru ism*, Mj. 0 bj„(Q)^Obj*(Q). Define a as follows: a-code, 0 q° 

If MolU, Qj is an infinite structure, let Obj(Q) be the decode,,. That is, a is a query Obj m (Q)-*ObjX^)- Given 

class of finitely representable databases over M, that is, this, notice that: 

60 

Obj(Q)-U„ <H ,Obj n (Q) and Obj„(Q) is the collection of 

subsets of U" of the form {(x,, . . . , x JM N a(x 19 . . . pcj} . a . codCrt=dccodCA . ^ r dccode>| . 
where a is quantifier-free first-order formula in L (Q). We 

use SAlg„ for semi-algebraic sets. on Obj„(Q). Thus, it remains to show that a is safe, i.e., 

Let S be an m-ary relational symbol, and let ^(y^ . . . , 65 preserves finiteness. Let X be a finite set in IT. Then 

y„) be a first-order formula in the language of S and Q. Then decode„(X) e C, decode M (X) c C. Since ^ is C-preserving, 

this query defines a map from Obj m (Q) to Obj„(Q) as we get that Y«^[decode„(X)] 6 Obj*(Q) is in C, too, and 
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thus code/Y) is finite. This proves finiteness of a, and , h fa - fa ^ direcUon f - ^ 7) = {7 + ^7M0}, 

concludes the proof of correctness of the construction. . ~ & - v 0 . v . , V , f . ' 1 4 , J 

v I is a face of Xq. Since Xq is polyhedral of hneality zero, the 

We now turn to examples in the case when Q=\+, *, 0,1, set of vertices and extreme directions is finite. By 

<); that is, we are looking for canonical representations in 5 (generalized) Caratheodory's theorem, every point z of Xq 

SAlg. Let CPH be the class of convex polyhedra (i.e., » a combination of at most n+1 vertices and extreme 

intersections of a finite number of closed halfspaces) and directions, 
CPT be the class of convex poly topes (i.e., bounded 

polyhedra). The basic facts on convex sets that will be used _^ -* -» 

in the proofs of the propositions below are known in the art, 10 K x i+ • ■ • +K x k+Jh y i+ ■ • +Mm y m> where \, ^>0, Xj+ . . . 

e.g., R. T. Rockafellar, "Convex Analysis," Princeton Univ. k+m = n+1 - 

Press, 1970 SU g gests me following coding scheme. As before, for 

Proposition 5.1 simplicity of exposition, we assume several coding relations, 

The class CPT has canonical representation in SAlg. This but they can be com bin e d into one easily. It is to be 

may be proven as follows. Given a convex polytope X in 9T, 15 understood that, in accordance with the mathematical 

-» -> definitions, all the concepts we use are first-order definable 

its vertices can be found as V(X)«{ x E3T | x£X- over the real fldd We use re i a ti 0 ns LINEAL^ each such 

T*conv(X-~x )}. Thus, vertices of convex polytopes are relation contains a canonical representation (roughly, an 

definable in FO (+, *,0, 1, <), because the convex hull of a orthogonal basis) of the lineality space of X, provided its 

finite set of points is definable, and, in view of Caratheodo- 20 dimension is k. That is, at most one of these relations 

ry's theorem, we have: actually contains some information. We then have the rela- 
tions Vert and ExtDir for storing vertices and extreme 

_» ~ _» -* — directions of Xq. Finally, we have a relation Points that 

V(X>{ iG | x ex, V x x „ +1 ex- x • x * conv ({ x 1( contains pomts mat do nol belong to L+Xo (recall that the 

. . . , x „ +1 })} 25 cQjjjjg scheme applies to any semi-algebraic set, so there 

We now define code„. To simplify the notation, we let it could be such points, and we need to record them for the 

produce a pair of n-ary relations, but it can be coded in a decode function). 

straightforward manner by one relation. If X«=conv(V(X)), Thus, to code (an arbitrary semi-algebraic) set X, we first 

then code„(X)=(V(A),0); otherwise, code„(X)=(9T, X). The 30 note that its lineality space UX)={y | V ~x EX x+"y EX} 

function decode,,: 2 R " x2 R " -»2 Rn is defined as follows: . « « T /vX i95 r - * i w ~Vt / V \ /~* ~\ ^ 

and its orthogonal L(X) 2 «{ y | V x EL(X)( x , y )=0} are 

definable in FO+POLY (note that one can define the inner 

U conv ^v ifr*R n product in FO+POLY). Furthermore, for each k^n, there 

decode n (Y t Z) = < fa, ...fafrY ex j s ts a FO+POLY sentence dim* expressing the fact that 

' z otherwise, L(A) is a subspace of 9T and its dimension is k. This is true 

because in FO+POLY we can test linear independence; thus, 

Clearly, decode,, 0 code„ is the identity function for any we can check if there exists a system of k linearly indepen- 

(semi-algebraic) set; these functions are also first order dent vectors in L such that every vector in L is a linear 

definable. If X is a polytope, V(X) is finite, and by Carath- 4 0 combination of mem - 

eodory 's theorem each point of X is contained in the convex Next > we show how lo compute LINEAL* and VertDir. 

hull of most n+1 vertices of X. Hence, card (code„ (X)) We fiist sketch the co6in ^ scheme for LINEAL*. The set 

icard (V(X))" +1 , which is finite. If (Y, Z) is finite, then L ( x ) is FO+POLY-definable. Assume that it is a 

decode,, (Y) is conv(Y), and thus a convex polytope. This k-dimensional linear space (which is tested by dim*). Let A„ 

proves the proposition. 45 De 80016 canonically chosen n-dimensional simplex of diam- 

Proposition 5.2 eter 1 sucn tnat tne origin has barycentric coordinates 
The class CPH has canonical representation in SAlg. This 

may be proven as follows. Let X be a convex polyhedron in (I IV 

ST. Then X=L+(X HL- 1 ), where L is its lineality space, 

defined as { y | V x 6S- y + x EX} (it is a subspace of SR") Consider intersection of L(X) with 1-dimensional faces of 

and L 1 - is the orthogonal subspace {~y | V"xEL-("x , "y )-0}. A * (unless L(X) is a line, in which case we consider its 

We shall use Xq for X HL 195 in this proof. It is known that intersection with 2-dimensional faces of A J. If the intersec- 

Xq is a convex polyhedron of lineality zero, that is, it tion is a point, we record that point; if it contains the whole 

„ , „ 55 face, we record both endpoints of the face. From the selected 

contains no line. By A+B we mean { a + b | cUEA, bEB}. points> find a Unearly ^pendent subsystem (note that it 

Note the difference between the translate X-~x=>{y-~x | can be done canonically, for example, by listing the vertices 

-» v1 44 , t . v -* and 1-dimensional faces of A„ in some order). It then serves 

y EX} and the set-theoretic difference X-x; we use x to a§ a b ^ of ^ ^ w » ^ fa ^ 

disUnguish between them 6Q ^ can be r £ 0 4 tructed in FO+PO LY from its basis. 

For Xo, define its vertices as x EXo such that x g kt™ #u«# „ M u-..- * * c *u i- i c 

^ ^ Now that we have a representation for the lmeal space of 

convpCo-x). A direction is given by a vector y and corre- X, and a first-order formula defining L x y we have a 

sponds to the equivalence class of rays which are translate FO+POLY-formula defining X^ Using it, we can compute 

of each other. Note that each direction can be canonically vertices: 

represented by ~y such that ||~ylM- A direction ~y is an 65 

-> V(Xo)-{x G Xo | »= 3x ( , . . . , x^ G Xo-x x econv({x u . . . , 

extreme direction of Xq if for some vertex x , the ray x„^})}. 
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Clearly, this is a first-order definition. Next, we find the set: 

$E(X_0) = \{y \mid \|y\| = 1,\ \exists x \in V(X_0):\ l(x, y) \text{ is a face}\}$.

A subset Y of $X_0$ is a face if every closed line segment in $X_0$ with a relative interior point in Y has both endpoints in Y. Clearly, this is first-order definable, and thus $E(X_0)$, the set of extreme directions of $X_0$, is first-order definable.

Given two sets V and E in $\mathbb{R}^n$, by conv(V, E) we denote their convex hull, that is, the set of elements of $\mathbb{R}^n$ definable as

$\sum_{i=1}^{k} \lambda_i x_i + \sum_{j=1}^{m} \mu_j y_j$, where $x_i \in V$, $y_j \in E$, $\sum_{i=1}^{k} \lambda_i = 1$, $\lambda_i, \mu_j \ge 0$ and $k + m \le n + 1$.

Again, this can be done in FO+POLY.

We now describe $\mathrm{code}_n$. For a semi-algebraic set X, it produces a tuple of relations:

$(\mathrm{LINEAL}_0, \ldots, \mathrm{LINEAL}_n, \mathrm{Vert}, \mathrm{ExtDir}, \mathrm{Point})$

as follows. It first determines, by computing $L(X)$, $L(X)^{\perp}$, $V(X_0)$ and $E(X_0)$, if it is the case that L(X) is a linear subspace of $\mathbb{R}^n$ and:

$X = L(X) + \mathrm{conv}(V(X_0), E(X_0))$.

If this is the case, then $\mathrm{LINEAL}_k$, Vert, and ExtDir are produced as before, and Point is empty. Otherwise, Point coincides with X, and all other sets in the coding are taken to be $\mathbb{R}^n$. From the description above it follows that $\mathrm{code}_n$ is FO+POLY-definable.

To compute $\mathrm{decode}_n$, we first check if the first n+2 relations in the code coincide with $\mathbb{R}^n$, and, if this is the case, output the last relation in the code. Otherwise, we use the nonempty $\mathrm{LINEAL}_k$ with least $k$ to compute a linear subspace L of $\mathbb{R}^n$ generated by the vectors in $\mathrm{LINEAL}_k$ (if all $\mathrm{LINEAL}_k$ are empty, we let this subspace be $\{0\}$). Next, compute $Y = \mathrm{conv}(\mathrm{Vert}, \mathrm{ExtDir})$. Note that both are FO+POLY-definable. Finally, return $L + (Y \cap L^{\perp})$; this is FO+POLY-definable also. This completes the proof.

Let SLinComp be the class of compact (i.e., closed and bounded) semi-linear sets. We resolve this case for dimension 2, that is, for non-convex two dimensional polygons.

Proposition 5.3

The class $\mathrm{SLinComp}_2$ has canonical representation in $\mathrm{SAlg}_2$. This may be proven as follows. An object in $\mathrm{SLinComp}_2$ is a finite union of convex polytopes in $\mathbb{R}^2$; this easily follows from cell decomposition. Any such object X admits a triangulation that does not introduce new vertices. Thus, the idea of the coding is to find the set V(X) of vertices and use as the code triples of vertices (not necessarily distinct) $(x, y, z)$ with $\mathrm{conv}(\{x, y, z\}) \subseteq X$. More precisely, a triple $(x, y, z)$ belongs to code(X) if either $x, y, z \in V(X)$ and $\mathrm{conv}(\{x, y, z\}) \subseteq X$, or $x = y = z$ and there is no triple of elements of V(X) whose convex hull is contained in X and contains $x$. Thus $\mathrm{code}(X) \subseteq \mathbb{R}^6$. For decode, we use:

$\mathrm{decode}(Y) = \bigcup_{(x, y, z) \in Y} \mathrm{conv}(\{x, y, z\})$.

Clearly, $\mathrm{decode} \circ \mathrm{code}$ is the identity, decode is first-order definable, and decode(Y) is compact and semi-linear when Y is finite. Thus, it remains to show that V(X) is finite and FO+POLY-definable. The former is well-known. For the first-order definition of V(X) we use the following result. Let X be a finite union of polyhedra (in $\mathbb{R}^n$) and let $B_{\varepsilon}(\vec{x})$ be the ball of radius $\varepsilon$ around $\vec{x}$. Then for each $\vec{x}$, there exists $\delta > 0$ such that for any $0 < \varepsilon_1, \varepsilon_2 < \delta$, we have:

$\vec{x} + \bigcup_{\lambda > 0} \lambda \cdot [(X \cap B_{\varepsilon_1}(\vec{x})) - \vec{x}] = \vec{x} + \bigcup_{\lambda > 0} \lambda \cdot [(X \cap B_{\varepsilon_2}(\vec{x})) - \vec{x}]$

We denote this set by $X(\vec{x})$. Define the equivalence relation $\equiv_X$ by $\vec{y} \equiv_X \vec{z}$ if and only if $X(\vec{y}) = X(\vec{z})$. Then the vertices of X are precisely the one element equivalence classes of $\equiv_X$. It is routine to verify that the above can be translated into a FO+POLY definition of vertices. This completes the proof.
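The decode step of Proposition 5.3 can be illustrated with a minimal sketch. It assumes a finite code relation given as six-tuples of integer coordinates and tests whether a point belongs to the union of the coded triangles. The names `orient`, `in_triangle` and `decode_contains` are hypothetical and are not part of the described system; the sketch only mirrors the definition of decode as a union of convex hulls of triples.

```python
from typing import Iterable, Tuple

Point = Tuple[int, int]
Triple = Tuple[int, int, int, int, int, int]  # (x1, y1, x2, y2, x3, y3)


def orient(a: Point, b: Point, c: Point) -> int:
    """Twice the signed area of triangle abc (zero when collinear)."""
    return (b[0] - a[0]) * (c[1] - a[1]) - (b[1] - a[1]) * (c[0] - a[0])


def in_triangle(p: Point, a: Point, b: Point, c: Point) -> bool:
    """True if p lies in the closed convex hull of {a, b, c}."""
    if orient(a, b, c) == 0:
        # Degenerate hull (a segment or a single point): p must be
        # collinear with the points and inside their bounding box.
        if orient(a, b, p) or orient(a, c, p) or orient(b, c, p):
            return False
        xs, ys = (a[0], b[0], c[0]), (a[1], b[1], c[1])
        return min(xs) <= p[0] <= max(xs) and min(ys) <= p[1] <= max(ys)
    signs = (orient(a, b, p), orient(b, c, p), orient(c, a, p))
    # Inside (or on the boundary) unless the signs are strictly mixed.
    return not (min(signs) < 0 < max(signs))


def decode_contains(code: Iterable[Triple], p: Point) -> bool:
    """Membership test for decode(Y): the union of conv({x, y, z})."""
    return any(
        in_triangle(p, (t[0], t[1]), (t[2], t[3]), (t[4], t[5])) for t in code
    )


# Hypothetical code relation: one proper triangle and one isolated point.
example_code = [(0, 0, 4, 0, 0, 4), (10, 10, 10, 10, 10, 10)]
```

Because the code relation is finite, the decoded set is a finite union of triangles, segments and points, which is the property the proposition relies on.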

Summing up, we have:

Theorem 5.1

There exists a recursively enumerable class of FO+POLY queries that captures the class of CPT (CPH and $\mathrm{SLinComp}_2$, respectively) preserving queries.
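The scheme behind this result, composing a decode function, a safe query on finite relations, and a code function, can be sketched in the simplest setting. The sketch assumes dimension one, where a convex polytope is a closed bounded interval (or empty), its code is the set of its endpoints, and decode takes the convex hull of a finite set. The names `code`, `decode`, `lift` and the sample query `phi` are hypothetical illustrations, not the pre-processor's actual interface.

```python
from typing import Callable, FrozenSet, Optional, Tuple

# A 1-D convex polytope: a closed interval (lo, hi), or None if empty.
Interval = Optional[Tuple[int, int]]
Finite = FrozenSet[int]


def code(x: Interval) -> Finite:
    """Canonical finite representation: the set of vertices (endpoints)."""
    return frozenset() if x is None else frozenset(x)


def decode(y: Finite) -> Interval:
    """Convex hull of a finite set of reals."""
    return None if not y else (min(y), max(y))


def lift(phi: Callable[[Finite], Finite]) -> Callable[[Interval], Interval]:
    """Turn a finiteness-preserving query phi into decode . phi . code."""
    return lambda x: decode(phi(code(x)))


def phi(s: Finite) -> Finite:
    """A sample safe query: finite input always yields finite output."""
    return frozenset({v + 5 for v in s} | {2 * v for v in s})


psi = lift(phi)  # maps intervals to intervals by construction
```

Whatever finite-to-finite query is supplied as `phi`, the lifted query returns an interval or the empty set, which is the one-dimensional analogue of preserving convex polytopes.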

Given the above-described teachings of the invention, an illustrative scenario is presented below in the context of FIGS. 4A and 4B whereby a query pre-processor of the invention performs query translation by providing an effective syntax query in response to a user-input query including certain geometric objects. The effective syntax query ensures that the same geometric properties associated with the objects in the original query are preserved in the results returned by the engine. In the following example, we show, using SQL syntax, how a user forms a query that preserves the property of being a union of triangles. Such a query would first compute vertices of its input, thereby creating a finite relation with six attributes (each vertex of a triangle is a point on the real plane and is thus described by two real numbers). The user then applies a query that was shown to be safe by the pre-processor and obtains another relation with six attributes. This relation is treated as coding of the output: each tuple describes a triangle in the output. Since there are finitely many six-tuples, the result of the query is a finite union of triangles. Below, we describe a safe query on finite relations with six attributes first, and then apply it to a geometric relation.
Referring to FIGS. 4A and 4B, a flow diagram is shown of a method of effective syntax query translation in the context of geometric region data according to an illustrative embodiment of the present invention. In step 410, the user 12 enters a query "SafeRelational" at the user interface 14. The query applies to a database schema with input table VERT, which is a finite relational table with 6 attributes: v1xval, v1yval, v2xval, v2yval, v3xval and v3yval. It is to be appreciated that "SafeRelational" outputs a table with six columns as well. In this example, the query "SafeRelational" entered by the user may be:

SELECT U1=R1.X,
       U2=R2.X,
       V1=R3.X,
       V2=R4.X,
       W1=R5.X,
       W2=R6.X
FROM R R1, R R2, R R3, R R4, R R5, R R6, VERT
WHERE U1=VERT.v1xval+5 AND
      U2=VERT.v1yval*2 AND
      V1=VERT.v2xval+5 AND
      V2=VERT.v2yval*2 AND
      W1=VERT.v3xval+5 AND
      W2=VERT.v3yval*2

In step 420, the user interface 14 sends the query to the query pre-processor 16 which accepts the query since it requires no modification to be made safe. In step 430, the user creates a schema Figure of type region (e.g., stores triangular regions in a plane) and inputs some sample data. The user then, in step 440, forms query "SafeGeometric: Apply SafeRelational to Vertices(Regions)".

In response, the query pre-processor generates a spatial query "SafeGeometric2," which takes any region and returns the region obtained by taking all triangles making up the region, moving them 5 places to the right, and doubling their size. The user then, in step 450, applies "SafeGeometric" to the schema Figure by sending "SafeGeometric2" to the database engine via the user interface.
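As a rough illustration of this scenario, the relational part of the query can be sketched on an in-memory table. The sketch assumes the VERT table is a set of six-tuples (v1xval, v1yval, v2xval, v2yval, v3xval, v3yval) and follows the WHERE clause of "SafeRelational" as written, adding 5 to each x value and multiplying each y value by 2. The function name `safe_relational` and the sample data are hypothetical.

```python
from typing import Set, Tuple

# One row of VERT: (v1xval, v1yval, v2xval, v2yval, v3xval, v3yval).
Row = Tuple[int, int, int, int, int, int]


def safe_relational(vert: Set[Row]) -> Set[Row]:
    """Apply the WHERE-clause arithmetic to every row of a finite table.

    A finite input table always produces a finite output table, so each
    output row can again be read as the three vertices of a triangle.
    """
    return {
        (x1 + 5, y1 * 2, x2 + 5, y2 * 2, x3 + 5, y3 * 2)
        for (x1, y1, x2, y2, x3, y3) in vert
    }


# Hypothetical sample data: two triangles stored by their vertices.
sample_vert = {(0, 0, 1, 0, 0, 1), (2, 2, 4, 2, 2, 3)}
result = safe_relational(sample_vert)
```

Reading each row of `result` as a triangle and taking the union gives the output region, so the geometric query obtained this way returns a finite union of triangles whenever its input is one.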

Advantageously, in this scenario, the query pre-processor 
generates an effective syntax query (SafeGeometric2) that 
ensures that the geometric properties associated with the 
user's original query are preserved in the output returned by 
the spatial database engine. 

Although illustrative embodiments of the present inven- 
tion have been described herein with reference to the accom- 
panying drawings, it is to be understood that the invention 
is not limited to those precise embodiments, and that various 
other changes and modifications may be effected therein by
one skilled in the art without departing from the scope or 
spirit of the invention. 

What is claimed is: 

1. A method for use in a database system, the method 
comprising the steps of: 

obtaining an original query entered by a user; and 
pre-processing the original query before submittal to a 
database engine associated with the database system 
wherein a result of the pre-processing operation is to 
ensure that the pre-processed query provided to the 
engine is safe;
wherein the original query is a conjunctive query and the 
pre-processing step includes analyzing the original 
conjunctive query to determine if the original conjunc- 
tive query would return finite results from the database, 
and prompting the user to restrict the original conjunc- 
tive query, if the original query would not return finite 
results, by inserting at least one range-restriction in the 
original query, the range-restriction specifying an upper 
bound on the results to be returned by the database as 
a set of roots of polynomials with coefficients coming 
from an active domain of the database and a finite set 
of constants. 

2. The method of claim 1, wherein the database contains
spatial data.

3. The method of claim 1, wherein the pre-processed
query supports first-order logic and one or more polynomial
constraints.

4. The method of claim 1, wherein the pre-processed 
query supports first-order logic and one or more linear 
constraints. 

5. Apparatus for use in a database system, the apparatus 
comprising: 

at least one processor operative to obtain an original query 
entered by a user, and to pre-process the original query 
before submittal to a database engine associated with 
the database system wherein a result of the pre- 
processing operation is to ensure that the pre-processed 
query provided to the engine is safe wherein the 
original query is a conjunctive query and the pre- 
processing operation includes analyzing the original 
conjunctive query to determine if the original conjunc- 
tive query would return finite results from the database, 
and prompting the user to restrict the original conjunc- 
tive query, if the original query would not return finite 
results, by inserting at least one range-restriction in the 
original query, the range-restriction specifying an upper 
bound on the results to be returned by the database as 
a set of roots of polynomials with coefficients coming 
from an active domain of the database and a finite set 
of constants. 

6. The apparatus of claim 5, wherein the database contains 
spatial data. 

7. The apparatus of claim 5, wherein the pre-processed 
query supports first-order logic and one or more polynomial
constraints. 

8. The apparatus of claim 5, wherein the pre-processed
query supports first-order logic and one or more linear 
constraints. 

9. An article of manufacture for use in a database system, 
comprising a machine readable medium containing one or 
more programs which when executed implement the steps 
of: 

obtaining an original query entered by a user; and 
pre-processing the original query before submittal to a 
database engine associated with the database system 
wherein a result of the pre-processing operation is to 
ensure that the pre-processed query provided to the 
engine is safe; 
wherein the original query is a conjunctive query and the 
pre-processing step includes analyzing the original 
conjunctive query to determine if the original conjunc- 
tive query would return finite results from the database, 
and prompting the user to restrict the original conjunc- 
tive query, if the original query would not return finite 
results, by inserting at least one range-restriction in the 
original query, the range-restriction specifying an upper 
bound on the results to be returned by the database as 
a set of roots of polynomials with coefficients coming 
from an active domain of the database and a finite set 
of constants. 
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